This chapter deals with the relationship between logical definability and
computational complexity on finite structures. Particular emphasis is given to
game-based evaluation algorithms for various logical formalisms and to logics 
capturing complexity classes. 

In addition to the most common logical systems such as first-order and 
second-order logic (and their fragments), this survey focuses on algorithmic 
questions and complexity results related to fixed-point logics (including
fixed-point extensions of first-order logic, the modal μ-calculus, the database
query language Datalog, and fixed-point logics with counting).

Finally, it is discussed how the general approach and the methodology of 
finite model theory can be extended to suitable domains of infinite structures. 
As an example, some results relating metafinite model theory to complexity 
theory are presented. 


3.1 Definability and Complexity 


One of the central issues in finite model theory is the relationship between 
logical definability and computational complexity. We want to understand 
how the expressive power of a logical system — such as first-order or second- 
order logic, least fixed-point logic, or a logic-based database query language 
such as Datalog — is related to its algorithmic properties. Conversely, we want 
to relate natural levels of computational complexity to the defining power of 
logical languages, i.e., we want logics that capture complexity classes.¹

The aspects of finite model theory that are related to computational com- 
plexity are also referred to as descriptive complexity theory. While computa- 
tional complexity theory is concerned with the computational resources such 
as time, space, or the amount of hardware that are necessary to decide a 
property, descriptive complexity theory asks for the logical resources that
are necessary to define it. In this chapter we shall give a survey of descriptive
complexity theory. We shall assume that the reader is familiar with fundamental
notions of logic and complexity theory. Specifically, we assume familiarity
with first-order logic and with deterministic and non-deterministic complexity
classes. See the appendix to this chapter for a brief survey of alternating
complexity classes.

¹ For a potential application of such results, see Exercise 3.5.32.

In Sect. 3.1, we discuss some basic issues concerning the relationship be- 
tween logic and complexity, we introduce model-checking games, and we de- 
termine in a detailed way the complexity of first-order model checking. 

In Sect. 3.2, we make precise the notion of a logic capturing a complexity 
class. As our first capturing result, we prove Fagin’s Theorem, which says 
that existential second-order logic captures NP. In a limited scenario, namely 
for the domain of ordered structures, we then derive capturing results for a 
number of other complexity classes, including PTIME and LOGSPACE, by 
use of fragments of second-order logic (such as second-order Horn logic) and 
by extensions of first-order logic (such as transitive closure logics). 

Section 3.3 is devoted to fixed-point logics. These are probably the most 
important logics for finite model theory and also play an important role in 
many other fields of logic in computer science. We shall discuss many variants 
of fixed point logics, including least, inflationary and partial fixed point logic, 
the modal μ-calculus, and the database query language Datalog. We shall
explain model checking issues, capturing results for PTIME and PSPACE, 
and also discuss structural issues for these logics. 

In Sect. 3.4 we introduce logics with counting. One of the limitations of 
common logics on finite structures is an inability to count. By adding to 
first-order logic and, in particular, to fixed-point logic an explicit counting 
mechanism, one obtains powerful logics that come quite close to capturing 
PTIME. 

Section 3.5 is devoted to capturing results on certain specific domains of 
unordered structures, via a technique called canonization. While the general 
problem of whether there exists a logic capturing PTIME on all finite struc- 
tures is still open (and it is widely conjectured that no such logic exists), 
canonization permits us to find interesting domains of structures where fixed- 
point logic or fixed-point logic with counting can express all of PTIME. 

Finally, in Sect. 3.6 we discuss the extension of the general approach and 
methods of finite model theory to suitable domains of infinite structures, i.e., 
the generalization of finite model theory to an algorithmic model theory. We 
discuss several domains of infinite structures for which this approach makes 
sense, and then treat, as an example, the domain of metafinite structures, for 
which capturing results have been studied in some detail. 


3.1.1 Complexity Issues in Logic 


One of the central issues in the relationship between complexity theory and 
logic is the algorithmic complexity of the common reasoning tasks for a logic.
There are numerous such tasks, but most of them can be easily reduced to
two (at least for logics with reasonable closure properties), namely
satisfiability testing and model checking. The satisfiability problem for a
logic L on a domain D of structures takes formulae ψ ∈ L as inputs, and the
question to be answered is whether there exists in D a model for ψ. Although
satisfiability problems are of fundamental importance in many areas of logic and its
applications, they do not really play a crucial role in finite model theory. Nev- 
ertheless, they are considered occasionally and, moreover, some of the central 
results of finite model theory have interesting connections with satisfiability 
problems. We shall point out some such relations later. 

On the other hand, model-checking problems occupy a central place in 
finite model theory. For a logic L and a domain D of (finite) structures, the 
model-checking problem asks, given a structure 𝔄 ∈ D and a formula
ψ ∈ L, whether it is the case that 𝔄 ⊨ ψ. A closely related problem is
formula evaluation (or query evaluation): given a structure 𝔄 and a formula
ψ(x̄) (with free variables x̄), the problem is to compute the relation defined by
ψ on 𝔄, i.e. the set ψ^𝔄 := {ā : 𝔄 ⊨ ψ(ā)}. Obviously, the evaluation problem
for a formula with k free variables on a structure with n elements reduces to
n^k model-checking problems.
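
As a minimal sketch of this reduction (in Python; the callback `holds` is a
hypothetical stand-in for a model checker for the fixed formula and is not part
of the text), query evaluation makes one model-checking call per k-tuple:

```python
from itertools import product

def evaluate_query(universe, k, holds):
    """Return the relation {a in A^k : holds(a)}.

    `holds` is a hypothetical model-checking callback: it receives a k-tuple
    of elements and reports whether the formula is true for that tuple.
    """
    return {a for a in product(universe, repeat=k) if holds(a)}

# Example: on a graph with edge set E, evaluate psi(x, y) = Exy and Eyx.
universe = [0, 1, 2]
E = {(0, 1), (1, 0), (1, 2)}
calls = []                               # records every model-checking call

def holds(a):
    calls.append(a)
    x, y = a
    return (x, y) in E and (y, x) in E

symmetric_pairs = evaluate_query(universe, 2, holds)
```

Here the structure has n = 3 elements and the formula has k = 2 free variables,
so the model checker is consulted n^k = 9 times.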

Note that a model-checking problem has two inputs: a structure and a 
formula. We can measure the complexity in terms of both inputs, and this 
is what is commonly referred to as the combined complexity of the
model-checking problem (for L and D). However, in many cases, one of the two
inputs is fixed, and we measure the complexity only in terms of the other.
If we fix the structure 𝔄, then the model-checking problem for L on this
structure amounts to deciding Th_L(𝔄) := {ψ ∈ L : 𝔄 ⊨ ψ}, the L-theory
of 𝔄. The complexity of this problem is called the expression complexity
of the model-checking problem (for L on 𝔄). For first-order logic (FO) and
for monadic second-order logic (MSO) in particular, such problems have a
long tradition in logic and numerous applications in many fields. Of even
greater importance for finite model theory are model-checking problems for a
fixed formula ψ, which amount to deciding the model class of ψ inside D,
Mod_D(ψ) := {𝔄 ∈ D : 𝔄 ⊨ ψ}. Its complexity is the structure complexity
or data complexity of the model-checking problem (for ψ on D).

Besides the algorithmic analysis of logic problems, there is another aspect 
of logic and complexity that has become even more important for finite model 
theory, and which is really the central programme of descriptive complexity 
theory. The goal here is to characterize complexity from the point of view 
of logic (or, more precisely, model theory)² by providing, for each important
complexity level, logical systems whose expressive power (on finite structures,
or on a particular domain of finite structures) coincides precisely with that
complexity level. For a detailed definition, see Sect. 3.2. We shall see that
there have been important successes in this programme, but that there also
remain difficult problems that are still open.

² There also exist other logical approaches to complexity, based for instance on
proof theory. Connections to the finite model theory approach exist, but the
flavour is quite different.


3.1.2 Model Checking for First-Order Logic 


We shall now discuss the problem of evaluating first-order formulae on finite 
structures using a game-based approach. Model-checking problems, for almost 
any logic, can be cast as strategy problems for appropriate model-checking 
games (also called Hintikka games).³ With any formula ψ(x̄), any structure 𝔄
(of the same vocabulary as ψ), and any tuple ā of elements of 𝔄, we associate
a model-checking game G(𝔄, ψ(ā)). It is played by two players, Verifier
and Falsifier. Verifier (sometimes also called Player 0, or ∃, or Eloise) tries to
prove that 𝔄 ⊨ ψ(ā), whereas Falsifier (also called Player 1, or ∀, or Abelard)
tries to establish that the formula is false. For first-order logic, the evaluation
games are very simple, in the sense that winning conditions are positional, and
that the games are well-founded, i.e. all possible plays are finite (regardless
of whether the input structure is finite or infinite). For more powerful logics,
notably fixed-point logics, model-checking games may have infinite plays and
more complicated winning conditions (see Sect. 3.3.4).


The Game G(𝔄, ψ(ā))


Let 𝔄 be a finite structure and let ψ(x̄) be a relational first-order formula,
which we assume to be in negation normal form, i.e. built up from atoms
and negated atoms by means of the propositional connectives ∧, ∨ and the
quantifiers ∃, ∀. Obviously, any first-order formula can be converted in linear
time into an equivalent one in negation normal form. The model-checking
game G(𝔄, ψ(ā)) has positions (φ, ρ) such that φ is a subformula of ψ, and
ρ : free(φ) → A is an assignment from the free variables of φ to elements of
A. To simplify the notation we usually write φ(b̄) for a position (φ, ρ) where ρ
assigns the tuple b̄ to the free variables of φ. The initial position of the game
is the formula ψ(ā).

Verifier (Player 0) moves from positions associated with disjunctions and 
with formulae starting with an existential quantifier. From a position φ ∨ ϑ,
she moves to either φ or ϑ. From a position ∃yφ(b̄, y), Verifier can move to any
position φ(b̄, c), where c ∈ A. Dually, Falsifier (Player 1) makes corresponding
moves from conjunctions and universal quantifications. At atoms or negated
atoms, i.e. positions φ(b̄) of the form b = b′, b ≠ b′, Rb̄, or ¬Rb̄, the game is
over. Verifier has won the play if 𝔄 ⊨ φ(b̄); otherwise, Falsifier has won.

Model-checking games are a way of defining the semantics of a logic. The 
equivalence to the standard definition can be proved by a simple induction. 


³ These games should not be confused with the games used for model comparison
(Ehrenfeucht-Fraïssé games) that describe the power of a logic for distinguishing
between two structures.


Proposition 3.1.1. Verifier has a winning strategy for the game G(𝔄, ψ(ā))
if, and only if, 𝔄 ⊨ ψ(ā).


This suggests a game-based approach to model checking: given 𝔄 and ψ,
construct the game G(𝔄, ψ) and decide whether Verifier has a winning
strategy from the initial position. Let us therefore look a little closer at strategy
problems for games.


3.1.3 The Strategy Problem for Finite Games 


Abstractly, we can describe a two-player game with positional winning
conditions by a directed game graph G = (V, V_0, V_1, E), with a partitioning
V = V_0 ∪ V_1 of the nodes into positions where Player 0 moves and positions
where Player 1 moves. The possible moves are described by the edge relation
E ⊆ V × V. We call w a successor of v if (v, w) ∈ E, and we denote the
set of all successors of v by vE. To describe the winning conditions, we adopt
the convention that Player σ loses at positions v ∈ V_σ where no moves are
possible. (Alternatively, one could explicitly include in the game description
the sets S_0, S_1 of winning terminal positions for each player.)

A play of G is a path v_0, v_1, ... formed by the two players starting from a
given position v_0. Whenever the current position v_n belongs to V_σ, Player σ
chooses a move to a successor v_{n+1} ∈ v_n E; if no move is available, then
Player σ has lost the play. If this never occurs, the play goes on infinitely and
the winner has to be established by a winning condition on infinite plays. For
the moment, let us say that infinite plays are won by neither of the players.⁴

A strategy for a player is a function defining a move for each situation in 
a play where she has to move. Of particular interest are positional strategies, 
which do not depend on the history of the play, but only on the current 
position. Hence, a positional strategy for Player σ in G is a (partial) function
f : V_σ → V which indicates a choice (v, f(v)) ∈ E for positions v ∈ V_σ.
A play v_0, v_1, ... is consistent with a positional strategy f for Player σ if
v_{n+1} = f(v_n) for all v_n ∈ V_σ. A strategy for a player is winning from position
v_0 if she wins every play starting from v_0 that is consistent with that strategy.
We say that a strategy is winning on a set W if it is winning from each position
in W. The winning region W_σ for Player σ is the set of positions from which
she has a winning strategy.

A game is well-founded if all its plays are finite. Note that a
model-checking game G(𝔄, ψ(ā)) for a first-order formula ψ has a finite game graph
if, and only if, 𝔄 is finite, but it is well-founded in all cases. In general, however,
games with finite game graphs need not be well-founded.

A game is determined if, from each position, one of the players has
a winning strategy, i.e. if W_0 ∪ W_1 = V. Well-founded games are always
determined, and so are large classes of more general games (such as games in
the Borel hierarchy; see [82, 96]).

⁴ We shall later introduce games with more interesting winning conditions for
infinite plays.

We denote by GAME the strategy problem for games with finite game 
graphs and positional winning conditions, i.e. 


GAME = {(G,v) : Player 0 has a winning strategy in G from position v}. 


It is obvious that the GAME problem can be solved in polynomial time. Denote
by W_σ^n the set of positions from which Player σ has a strategy to win the game
in at most n moves. Then W_σ^0 = {v ∈ V_{1−σ} : vE = ∅} is the set of winning
terminal positions for Player σ, and we can compute the sets W_σ^n inductively
by using


W_σ^{n+1} := {v ∈ V_σ : vE ∩ W_σ^n ≠ ∅} ∪ {v ∈ V_{1−σ} : vE ⊆ W_σ^n}


until W_σ^{n+1} = W_σ^n.
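
The inductive computation can be sketched in Python as follows. The function
name `winning_region` and the representation of the game by Python sets are
assumptions made for this illustration; the loop is the naive polynomial-time
iteration, not the linear-time algorithm discussed next.

```python
def winning_region(V0, V1, E, sigma):
    """Compute W_sigma by iterating W^0, W^1, ... until the sets stabilise."""
    succ = {v: set() for v in V0 | V1}
    for u, v in E:
        succ[u].add(v)
    own, opp = (V0, V1) if sigma == 0 else (V1, V0)
    # W^0: terminal positions of the opponent, who is stuck there and loses.
    W = {v for v in opp if not succ[v]}
    while True:
        # Player sigma needs one good move; the opponent must have none left.
        W_next = ({v for v in own if succ[v] & W}
                  | {v for v in opp if succ[v] <= W})
        if W_next == W:
            return W
        W = W_next

# Example game: Player 0 owns a, c, e and Player 1 owns b, d, f.
V0 = {"a", "c", "e"}
V1 = {"b", "d", "f"}
E = {("a", "d"), ("b", "a"), ("b", "c"), ("e", "f"), ("f", "e")}
W0 = winning_region(V0, V1, E, 0)
W1 = winning_region(V0, V1, E, 1)
```

In this example the positions e and f lie on a cycle without exits, so they
belong to neither winning region, in line with the convention that infinite
plays are won by neither player.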


To see that GAME can actually be solved in linear time, a little more work 
is necessary. The following algorithm is a variant of depth-first search, and 
computes the entire winning sets for both players in time O(|V| + |E|).


Theorem 3.1.2. Winning regions of finite games can be computed in linear 
time. 


Proof. We present an algorithm that computes, for each position, which 
player, if any, has a winning strategy for the game starting at that position. 
During the computation three arrays are used: 


• win[v] contains either 0 or 1, indicating which player wins, or ⊥ if we do
  not know yet, or if none of the players has a winning strategy from v;
• P[v] contains the predecessors of v; and
• n[v] is the number of those successors w of v for which win[w] = ⊥.


A linear-time algorithm for the GAME problem 


Input: A game G = (V, V_0, V_1, E)


forall v ∈ V do                          (* 1: initialization *)
    win[v] := ⊥
    P[v] := ∅
    n[v] := 0
enddo

forall (u, v) ∈ E do                     (* 2: calculate P and n *)
    P[v] := P[v] ∪ {u}
    n[u] := n[u] + 1
enddo

forall v ∈ V_0                           (* 3: calculate win *)
    if n[v] = 0 then Propagate(v, 1)
forall v ∈ V_1
    if n[v] = 0 then Propagate(v, 0)

return win
end


procedure Propagate(v, σ)
    if win[v] ≠ ⊥ then return
    win[v] := σ                          (* 4: mark v as winning for Player σ *)
    forall u ∈ P[v] do                   (* 5: propagate change to predecessors *)
        n[u] := n[u] − 1
        if u ∈ V_σ or n[u] = 0 then Propagate(u, σ)
    enddo
end


The heart of this algorithm is the procedure Propagate(v, σ) which is called
any time we have found that Player σ has a winning strategy from position v.
Propagate(v, σ) records this fact and investigates whether we are now able to
determine the winning player for any of the predecessors of v. This is done by
applying the following rules:


• If the predecessor u belongs to Player σ, then this player has a winning
  strategy from u by moving to position v.

• If the predecessor u belongs to the opponent of Player σ, if win[u] is
  undefined, and if the winning player has already been determined for all
  successors w of u, then win[w] = σ for all of those successors, and hence
  Player σ wins from u regardless of the choice of her opponent.


Since parts 4 and 5 of the algorithm are reached only once for each
position v, the inner part of the loop in part 5 is executed at most Σ_v |P[v]| = |E|
times. Therefore the running time of the algorithm is O(|V| + |E|).

The correctness of the value assigned to win[v] is proved by a straightfor- 
ward induction on the number of moves in which the corresponding player can 
ensure that she wins. Note that the positions satisfying n[v] = 0 in part 3 are 
exactly those without outgoing edges even if n[v] is modified by Propagate. 
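
For readers who want to experiment, the algorithm can be transcribed into
Python roughly as follows. This is a sketch under stated assumptions: the
names `solve_game` and `propagate` are ours, `None` plays the role of ⊥, and
the recursion of Propagate is replaced by an explicit stack to stay clear of
Python's recursion limit on large graphs. The terminal positions are collected
before any propagation takes place, so the test n[v] = 0 in part 3 is only
ever applied to positions without outgoing edges.

```python
def solve_game(V0, V1, E):
    """Return win[v] in {0, 1, None} for every position (None: no winner)."""
    V = V0 | V1
    win = {v: None for v in V}           # part 1: initialization
    pred = {v: [] for v in V}            # P[v]
    n = {v: 0 for v in V}                # number of undecided successors
    for u, v in E:                       # part 2: calculate P and n
        pred[v].append(u)
        n[u] += 1

    def propagate(v, sigma):
        stack = [v]                      # every entry is to be won by sigma
        while stack:
            v = stack.pop()
            if win[v] is not None:
                continue
            win[v] = sigma               # part 4: mark v as won by sigma
            for u in pred[v]:            # part 5: update the predecessors
                n[u] -= 1
                owner = 0 if u in V0 else 1
                if owner == sigma or n[u] == 0:
                    stack.append(u)

    terminal = [v for v in V if n[v] == 0]
    for v in terminal:                   # part 3: the player to move loses
        propagate(v, 1 if v in V0 else 0)
    return win

V0 = {"a", "c", "e"}
V1 = {"b", "d", "f"}
E = {("a", "d"), ("b", "a"), ("b", "c"), ("e", "f"), ("f", "e")}
win = solve_game(V0, V1, E)
```

Each position is marked at most once and each edge is inspected at most once
in part 5, which mirrors the O(|V| + |E|) bound of the proof.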


GAME is known to be a PTIME-complete problem (see [57]). This remains 
the case for strictly alternating games, where E ⊆ V_0 × V_1 ∪ V_1 × V_0. Indeed,
any game can be transformed into an equivalent strictly alternating one by
introducing for each move (u, v) ∈ V_σ × V_σ a new node e ∈ V_{1−σ} and by
replacing the move (u, v) by two moves (u, e) and (e, v).

The GAME problem (sometimes also called the problem of alternating 
reachability) is a general combinatorial problem that reappears in different 
guises in many areas. To illustrate this by an example, we shall now show 
that the satisfiability problem for propositional Horn formulae is essentially 
the same problem as GAME.

It is well known that SAT-HORN, the satisfiability problem for propositional
Horn formulae, is 


• PTIME-complete [57], and
• solvable in linear time [36, 68].


Using the GAME problem, we can obtain very simple proofs for both
results. Indeed, GAME and SAT-HORN are equivalent under log-lin reductions,
i.e. reductions that are computable in linear time and logarithmic space. The
reductions are so simple that we can say that GAME and SAT-HORN are really 
the same problem. 


Theorem 3.1.3. SAT-HORN is log-lin equivalent to GAME.


Proof. GAME ≤_log-lin SAT-HORN: Given a finite game graph G =
(V, V_0, V_1, E), we can construct in time O(|V| + |E|) a propositional Horn
formula ψ_G consisting of the clauses u ← v for all edges (u, v) ∈ E with
u ∈ V_0, and the clauses u ← v_1 ∧ ··· ∧ v_m for all nodes u ∈ V_1, where
uE = {v_1, ..., v_m}. The minimal model of ψ_G is precisely the winning set W_0
for Player 0. Hence v ∈ W_0 if, and only if, the Horn formula ψ_G ∧ (0 ← v) is
unsatisfiable.

SAT-HORN ≤_log-lin GAME: Given a Horn formula ψ(X_1, ..., X_n) =
⋀_{i∈I} C_i with propositional variables X_1, ..., X_n and Horn clauses C_i of the
form H_i ← X_{i_1} ∧ ··· ∧ X_{i_m} (where the head of the clause, H_i, is either a
propositional variable or the constant 0), we define a game G_ψ as follows. The
positions of Player 0 are the initial position 0 and the propositional variables
X_1, ..., X_n, and the positions of Player 1 are the clauses of ψ. Player 0 can
move from a position X to any clause C_i with head X, and Player 1 can
move from a clause C_i to any variable occurring in the body of C_i. Formally,
G_ψ = (V, E), V = V_0 ∪ V_1 with V_0 = {0} ∪ {X_1, ..., X_n}, V_1 = {C_i : i ∈ I},
and


E = {(X, C) ∈ V_0 × V_1 : X = head(C)} ∪ {(C, X) ∈ V_1 × V_0 : X ∈ body(C)}.


Player 0 has a winning strategy for G_ψ from position X if, and only if, ψ ⊨ X.
In particular, ψ is unsatisfiable if, and only if, Player 0 wins from position 0.
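
The second reduction is easy to try out. The following Python sketch builds
the game G_ψ from a list of Horn clauses and decides unsatisfiability by
solving the game. The clause representation as (head, body) pairs and the
names `horn_to_game`, `player0_wins` and `horn_unsatisfiable` are assumptions
made for this illustration; the solver is the naive fixed-point iteration, so
the sketch does not achieve the linear time bound.

```python
def horn_to_game(clauses):
    """Build G_psi for Horn clauses given as (head, body) pairs.

    head is a variable name or the constant 0 (false); body is a tuple of
    variable names.  Clause positions are tagged ("C", i) to keep them apart
    from the variable positions.
    """
    V0 = {0} | {x for head, body in clauses for x in (head, *body)}
    V1 = {("C", i) for i in range(len(clauses))}
    E = set()
    for i, (head, body) in enumerate(clauses):
        E.add((head, ("C", i)))                 # Player 0 picks a clause
        E.update((("C", i), x) for x in body)   # Player 1 picks a body variable
    return V0, V1, E

def player0_wins(V0, V1, E):
    """Winning region of Player 0 by naive fixed-point iteration."""
    succ = {v: set() for v in V0 | V1}
    for u, v in E:
        succ[u].add(v)
    W = set()
    while True:
        W_next = ({v for v in V0 if succ[v] & W}
                  | {v for v in V1 if succ[v] <= W})
        if W_next == W:
            return W
        W = W_next

def horn_unsatisfiable(clauses):
    """psi is unsatisfiable iff Player 0 wins G_psi from position 0."""
    return 0 in player0_wins(*horn_to_game(clauses))

# X,  Y <- X,  0 <- X and Y   (unsatisfiable)
unsat = horn_unsatisfiable([("X", ()), ("Y", ("X",)), (0, ("X", "Y"))])
# Y <- X,  0 <- Y             (satisfiable: make everything false)
sat = not horn_unsatisfiable([("Y", ("X",)), (0, ("Y",))])
```

A clause with empty body (a fact) becomes a terminal position of Player 1, so
Player 0 wins there; this is how the game encodes that facts hold in the
minimal model.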


3.1.4 Complexity of First-Order Model Checking 


Roughly, the size of the model-checking game G(𝔄, ψ) is the number of
different instantiations of the subformulae of ψ with elements from 𝔄. It is in
many cases not efficient to construct the full model-checking game explicitly 
and then solve the strategy problem, since many positions of the game will 
not really be needed.

To measure the size of games, and the resulting time and space bounds for
the complexity of model checking as precisely as possible, we use, besides the
formula length |ψ|, the following parameters. The closure cl(ψ) is the set of
all subformulae of ψ. Obviously, |cl(ψ)| ≤ |ψ|, and in some cases |cl(ψ)| can
be much smaller than |ψ|. The quantifier rank qr(ψ) is the maximal nesting
depth of quantifiers in ψ, and the width of ψ is the maximal number of free
variables in subformulae, i.e.


width(ψ) = max{|free(φ)| : φ ∈ cl(ψ)}.
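
These parameters are straightforward to compute on a syntax tree. The sketch
below assumes a hypothetical representation of formulae as nested Python
tuples, namely ("atom", R, vars), ("not", φ), ("and", φ, ϑ), ("or", φ, ϑ),
("exists", x, φ) and ("forall", x, φ); neither this representation nor the
function names come from the text.

```python
def free(phi):
    """Set of free variables of phi."""
    op = phi[0]
    if op == "atom":
        return frozenset(phi[2])
    if op == "not":
        return free(phi[1])
    if op in ("and", "or"):
        return free(phi[1]) | free(phi[2])
    return free(phi[2]) - {phi[1]}       # exists / forall bind phi[1]

def closure(phi):
    """cl(phi): the set of all subformulae of phi, including phi itself."""
    op = phi[0]
    if op == "atom":
        return {phi}
    if op == "not":
        return {phi} | closure(phi[1])
    if op in ("and", "or"):
        return {phi} | closure(phi[1]) | closure(phi[2])
    return {phi} | closure(phi[2])

def qr(phi):
    """Quantifier rank: maximal nesting depth of quantifiers."""
    op = phi[0]
    if op == "atom":
        return 0
    if op == "not":
        return qr(phi[1])
    if op in ("and", "or"):
        return max(qr(phi[1]), qr(phi[2]))
    return 1 + qr(phi[2])

def width(phi):
    """Maximal number of free variables in a subformula."""
    return max(len(free(f)) for f in closure(phi))

# psi(x) = exists y (Exy and forall z not Eyz)
psi = ("exists", "y",
       ("and", ("atom", "E", ("x", "y")),
               ("forall", "z", ("not", ("atom", "E", ("y", "z"))))))
```

Because `closure` returns a set, repeated occurrences of the same subformula
are counted once, which is why |cl(ψ)| can be much smaller than |ψ|.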


Instead of considering the width, one can also rewrite formulae with as 
few variables as possible. 


Lemma 3.1.4. A first-order formula ψ has width k if, and only if, it is
equivalent, via a renaming of bound variables, to a first-order formula with at most
k distinct variable symbols.


Bounded-variable fragments of logics have received a lot of attention in fi- 
nite model theory. However, here we state the results in terms of formula width 
rather than number of variables to avoid the necessity to economize on the 
number of variables. Given the close connection between games and alternat- 
ing algorithms, it is not surprising that the good estimates for the complexity 
of model-checking games are often in terms of alternating complexity classes. 
We now describe an alternating model-checking algorithm for first-order logic 
that can be viewed as an on-the-fly construction of the model-checking game 
while playing it. 


Theorem 3.1.5. There is an alternating model-checking algorithm that, given
a finite structure 𝔄 and a first-order sentence ψ, decides whether 𝔄 ⊨ ψ in
time O(|ψ| + qr(ψ) log |A|) and space O(log |ψ| + width(ψ) log |A|) (assuming
that atomic statements are evaluated in constant time).


Proof. We present a recursive alternating procedure ModelCheck(𝔄, ρ, ψ)
that, given a finite structure 𝔄, a first-order formula ψ that may contain free
variables, and an assignment ρ : free(ψ) → A, decides whether 𝔄 ⊨ ψ[ρ].


ModelCheck(𝔄, ρ, ψ)


Input: a first-order formula ψ in negation normal form,
       a finite structure 𝔄 (with universe A),
       an assignment ρ : free(ψ) → A

if ψ is an atom or negated atom then
    if 𝔄 ⊨ ψ[ρ] accept else reject

if ψ = η ∨ ϑ then do
    guess φ ∈ {η, ϑ}, and let ρ′ := ρ|free(φ)
    ModelCheck(𝔄, ρ′, φ)

if ψ = η ∧ ϑ then do
    universally choose φ ∈ {η, ϑ}, and let ρ′ := ρ|free(φ)
    ModelCheck(𝔄, ρ′, φ)

if ψ = ∃xφ then do
    guess an element a of 𝔄
    ModelCheck(𝔄, ρ[x ↦ a], φ)

if ψ = ∀xφ then do
    universally choose an element a of 𝔄
    ModelCheck(𝔄, ρ[x ↦ a], φ)


A straightforward induction shows that the procedure is correct. The time 
needed by the procedure is the depth of the syntax tree of ψ plus the time
needed to produce the variable assignments. On each computation path, at
most qr(ψ) elements of 𝔄 have to be chosen, and each element needs log |A|
bits. Hence the time complexity is O(|ψ| + qr(ψ) log |A|). During the
evaluation, the algorithm needs to maintain a pointer to the current position
in ψ and to store the current assignment, which needs |free(φ)| log |A| bits
for the current subformula φ. Hence the space needed by the algorithm is
O(log |ψ| + width(ψ) log |A|).
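
An alternating procedure cannot be run directly on ordinary hardware, but it
can be simulated deterministically by exploring all choices: existential
guesses become `any` and universal choices become `all`. The following Python
sketch does this under our own assumptions: formulae are nested tuples
("atom", R, vars), ("not", atom), ("and", ...), ("or", ...), ("exists", x, φ),
("forall", x, φ), and a structure is a universe plus a dictionary of
relations. For simplicity it does not restrict ρ to free(φ), and its running
time is not the alternating time bound of the theorem.

```python
def model_check(universe, relations, phi, rho=None):
    """Decide whether the structure satisfies phi under the assignment rho."""
    rho = rho or {}
    op = phi[0]
    if op == "atom":                     # ("atom", R, (x1, ..., xr))
        _, name, args = phi
        return tuple(rho[x] for x in args) in relations[name]
    if op == "not":                      # negated atom
        return not model_check(universe, relations, phi[1], rho)
    if op == "or":                       # "guess" a disjunct
        return any(model_check(universe, relations, f, rho) for f in phi[1:])
    if op == "and":                      # "universally choose" a conjunct
        return all(model_check(universe, relations, f, rho) for f in phi[1:])
    if op == "exists":                   # "guess" an element
        _, x, body = phi
        return any(model_check(universe, relations, body, {**rho, x: a})
                   for a in universe)
    if op == "forall":                   # "universally choose" an element
        _, x, body = phi
        return all(model_check(universe, relations, body, {**rho, x: a})
                   for a in universe)
    raise ValueError(f"unknown connective: {op!r}")

# The directed path 0 -> 1 -> 2.
universe = [0, 1, 2]
relations = {"E": {(0, 1), (1, 2)}}
no_isolated = ("forall", "x", ("exists", "y",
               ("or", ("atom", "E", ("x", "y")), ("atom", "E", ("y", "x")))))
all_have_successor = ("forall", "x", ("exists", "y", ("atom", "E", ("x", "y"))))
```

The recursion depth is bounded by the depth of the syntax tree, and each
recursive call stores only the current assignment, which reflects the space
analysis above.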


Theorem 3.1.6. The model-checking problem for first-order logic is 
PSPACE-complete. For any fixed k ≥ 2, the model-checking problem for
first-order formulae of width at most k is PTIME-complete.


Proof. Membership in these complexity classes follows immediately from
Theorem 3.1.5 via the facts that alternating polynomial time coincides with
polynomial space and alternating logarithmic space coincides with polynomial
time.

Completeness follows by straightforward reductions from known complete 
problems. QBF, the evaluation problem for quantified Boolean formulae, is 
PSPACE-complete. It reduces to first-order model checking on the fixed
structure (A, P) with A = {0, 1} and P = {1}. Given a quantified Boolean formula
ψ without free propositional variables, we can translate it into a first-order
sentence ψ′ as follows: replace every quantification ∃X_i or ∀X_i over a
propositional variable X_i by a corresponding first-order quantification ∃x_i or ∀x_i and
replace atomic propositions X_i by atoms Px_i. Obviously, ψ evaluates to true
if, and only if, (A, P) ⊨ ψ′. This proves that the expression complexity and
the combined complexity of first-order model checking are PSPACE-complete.
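
The translation is purely syntactic, as the following Python sketch suggests.
It assumes a hypothetical tuple representation of quantified Boolean formulae
with nodes ("var", X), ("not", φ), ("and", φ, ϑ), ("or", φ, ϑ),
("exists", X, φ) and ("forall", X, φ); the names `qbf_to_fo` and `holds` are
ours. The small evaluator is included only to check the translation on the
fixed structure (A, P).

```python
def qbf_to_fo(phi):
    """Translate a QBF into a first-order sentence over the vocabulary {P}.

    The name of the propositional variable X_i is reused as the name of the
    first-order variable x_i.
    """
    op = phi[0]
    if op == "var":
        return ("atom", "P", (phi[1],))          # X_i becomes P x_i
    if op == "not":
        return ("not", qbf_to_fo(phi[1]))
    if op in ("and", "or"):
        return (op, qbf_to_fo(phi[1]), qbf_to_fo(phi[2]))
    return (op, phi[1], qbf_to_fo(phi[2]))       # exists / forall

def holds(phi, rho, universe=(0, 1), P=frozenset({1})):
    """Evaluate a first-order formula on the fixed structure ({0, 1}, {1})."""
    op = phi[0]
    if op == "atom":
        return rho[phi[2][0]] in P
    if op == "not":
        return not holds(phi[1], rho)
    if op == "and":
        return holds(phi[1], rho) and holds(phi[2], rho)
    if op == "or":
        return holds(phi[1], rho) or holds(phi[2], rho)
    values = (holds(phi[2], {**rho, phi[1]: a}) for a in universe)
    return any(values) if op == "exists" else all(values)

# forall X exists Y ((X or Y) and (not X or not Y))  is true
q_true = ("forall", "X", ("exists", "Y",
          ("and", ("or", ("var", "X"), ("var", "Y")),
                  ("or", ("not", ("var", "X")), ("not", ("var", "Y"))))))
# exists X forall Y (X and Y)  is false
q_false = ("exists", "X", ("forall", "Y", ("and", ("var", "X"), ("var", "Y"))))
```

Since the structure is fixed and has only two elements, all of the hardness
comes from the formula, which is why this argument speaks about expression
complexity.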

To see that the model-checking problem for first-order formulae of width 2 
is PTIME-complete, we reduce to it the GAME problem for strictly alternating 
games, with Player 0 moving first. Given a strictly alternating game graph 
G = (V, V_0, V_1, E), we construct formulae ψ_n(x) of width 2, expressing the fact
that Player 0 has a winning strategy from x ∈ V_0 in n rounds. Let


ψ_1(x) := ∃y(Exy ∧ ∀z¬Eyz)
ψ_{n+1}(x) := ∃y(Exy ∧ ∀z(Eyz → ψ_n(z))).


Obviously, ψ_n has width 2, and G ⊨ ψ_n(v) if, and only if, Player 0 can win
from position v in at most n rounds. Now, if Player 0 has a winning strategy,
then she also has one for winning in at most n rounds, where n = |V|, since
otherwise the game will be caught in a loop. Hence any instance G, v of the
GAME problem (for strictly alternating games), with v ∈ V_0, can be reduced to
the instance G, ψ_n(v) of the model-checking problem for first-order formulae
of width 2.
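
As a worked instance, added here for illustration, the formulae can be
rewritten with only the two variable symbols x and y, in the spirit of
Lemma 3.1.4, by renaming the universally quantified variable to x (this
renaming is our choice, not part of the construction above):

```latex
\begin{align*}
\psi_1(x) &\equiv \exists y\,\bigl(Exy \wedge \forall x\,\neg Eyx\bigr),\\
\psi_2(x) &\equiv \exists y\,\Bigl(Exy \wedge \forall x\,\bigl(Eyx \rightarrow
            \exists y\,(Exy \wedge \forall x\,\neg Eyx)\bigr)\Bigr).
\end{align*}
```

Since ψ_n occurs exactly once in ψ_{n+1}, the length of ψ_n grows only
linearly in n, so the formula ψ_n with n = |V| is of size polynomial in the
size of the game.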


Remark. The argument for PTIME-completeness in fact also applies to
propositional modal logic (ML) [55]. Instead of the formulae ψ_n(x) constructed
above, we take the modal formulae


φ_1 := ◇□false,    φ_{n+1} := ◇□φ_n.


Corollary 3.1.7. The model-checking problem for ML is PTIME-complete. 


If we consider a fixed formula ψ, Theorem 3.1.5 tells us that the data
complexity of first-order logic is much lower than the expression or combined 
complexity. 


Corollary 3.1.8. Let ψ be a first-order sentence. Then


{𝔄 : 𝔄 finite, 𝔄 ⊨ ψ} ∈ ALOGTIME.


In particular, the evaluation problem for any fixed first-order sentence can be 
computed deterministically in logarithmic space. 


3.1.5 Encoding Finite Structures by Words 


Complexity theory, at least in its current form, is based on classical computa- 
tional models, most notably Turing machines, that take as inputs words over 
a fixed finite alphabet. If we want to measure the complexity of problems on 
finite structures in terms of these notions, we have to represent structures by 
words so that they can be used as inputs for, say, Turing machines. This may 
seem a trivial issue, and for purely algorithmic questions (say for determining 
the cost of a model-checking algorithm) it indeed often is. However, the pro- 
gramme of finite model theory is to link complexity with logical definability 
in a deeper way, and for this purpose the representation of structures by words
needs careful consideration. It is also at the source of some major unresolved 
problems that we shall discuss later. 

At least implicitly, an encoding of a finite structure by a word requires 
that we select an ordered representation of the structure. To see this, consider 
the common encoding of a graph G = (V, E) by its adjacency matrix. Once we 
have fixed an enumeration of V, say V = {v_0, ..., v_{n−1}}, we can represent the
graph by the word w_0 ··· w_{n²−1}, where w_{in+j} = 1 if (v_i, v_j) ∈ E and w_{in+j} = 0
otherwise, i.e. row after row of the adjacency matrix. However, this encoding
is not canonical. There are n! possibilities of enumerating V, so there may be up
to n! different encodings of the same graph by binary strings. But if the graphs
come along with a linear order, we do have a canonical way of enumerating the
elements and therefore a canonical encoding. Let us now discuss encodings of
arbitrary finite structures (of finite vocabulary) by words.
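
The dependence on the enumeration is easy to observe experimentally. The
following Python sketch (the function name `encode_graph` and the example
graph are assumptions made for this illustration) encodes a small directed
graph row by row for every enumeration of its vertices:

```python
from itertools import permutations

def encode_graph(order, E):
    """Adjacency matrix of the graph, row after row, for the enumeration `order`.

    The bit at index i*n + j is 1 iff (order[i], order[j]) is an edge.
    """
    return "".join("1" if (u, v) in E else "0" for u in order for v in order)

V = ["a", "b", "c"]
E = {("a", "b"), ("b", "c")}                 # the directed path a -> b -> c
canonical = encode_graph(("a", "b", "c"), E)
encodings = {encode_graph(order, E) for order in permutations(V)}
```

The directed path has no non-trivial automorphisms, so all 3! = 6 enumerations
yield different words; a graph with many symmetries, such as a graph without
edges, has fewer distinct encodings.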


Definition 3.1.9. For any vocabulary τ, we write Fin(τ) for the class of finite
τ-structures and Ord(τ) for the class of all structures (𝔄, <), where 𝔄 ∈ Fin(τ)
and < is a linear order on A (the universe of 𝔄).


For any structure (𝔄, <) ∈ Ord(τ) of cardinality n and for any k, we can
identify A^k with the set {0, ..., n^k − 1}, by associating each k-tuple with its
rank in the lexicographical ordering induced by < on A^k. Ordered structures
can be encoded as binary strings in many natural ways. The particular choice 
of an encoding is not important. We only need the following conditions to be 
satisfied. 


Definition 3.1.10. An encoding code : Ord(τ) → Σ* (over any finite
alphabet Σ) is good if it identifies isomorphic structures, if its values are
polynomially bounded, if it is first-order definable, and if it allows the values
of atomic statements to be computed efficiently. Formally, this means that the
following conditions are satisfied:


(i) code(𝔄, <) = code(𝔅, <′) if and only if (𝔄, <) ≅ (𝔅, <′).
(ii) |code(𝔄, <)| ≤ p(|A|) for some polynomial p.
(iii) For all k ∈ ℕ and all symbols σ ∈ Σ, there exists a first-order formula
β_σ(x_1, ..., x_k) of vocabulary τ ∪ {<} such that, for all structures
(𝔄, <) ∈ Ord(τ) and all ā ∈ A^k, the following equivalence holds:


(𝔄, <) ⊨ β_σ(ā) iff the āth symbol of code(𝔄, <) is σ.


(iv) Given code(𝔄, <), a relation symbol R of τ, and (a representation of) a
tuple ā, one can efficiently decide whether 𝔄 ⊨ Rā.


The precise meaning of ‘efficiently’ in clause (iv) depends on the context 
(e.g. the problem that is studied, the machine model considered, and the level 
of abstraction at which one is studying a given problem). For the analysis 
of algorithms, one often assumes that atomic statements are evaluated in 
constant (or even unit) time on a Random Access Machine (RAM). A minimal 
requirement is that atoms can be evaluated in linear time and logarithmic 
space. 


A convenient encoding is given as follows. Let < be a linear order on
A and let 𝔄 = (A, R_1, ..., R_t) be a τ-structure of cardinality n. Let ℓ be
the maximal arity of R_1, ..., R_t. With each relation R of arity j, we
associate a string χ(R) = w_0 ··· w_{n^j − 1} 0^{n^ℓ − n^j} ∈ {0,1}^{n^ℓ}, where w_i = 1 if
the ith tuple of A^j belongs to R, and w_i = 0 otherwise. Now, we set
code(𝔄, <) = 1^n 0^{n^ℓ − n} χ(R_1) ··· χ(R_t).
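
This encoding can be sketched in Python as follows. The function names `chi`
and `code` follow the notation above, but the input format (the universe
listed in increasing <-order, relations given as pairs of arity and tuple set)
is an assumption made for this illustration, as is the requirement that there
is at least one relation and that all arities are at least 1.

```python
from itertools import product

def chi(order, R, arity, max_arity):
    """Characteristic string of R over A^arity, padded to length n^max_arity.

    itertools.product enumerates the tuples in the lexicographic order
    induced by `order`.
    """
    n = len(order)
    bits = "".join("1" if t in R else "0" for t in product(order, repeat=arity))
    return bits + "0" * (n ** max_arity - n ** arity)

def code(order, relations):
    """code(A, <) = 1^n 0^(n^l - n) chi(R_1) ... chi(R_t).

    `order` lists the universe in increasing <-order; `relations` is a list
    of (arity, set_of_tuples) pairs.
    """
    n = len(order)
    ell = max(arity for arity, _ in relations)
    prefix = "1" * n + "0" * (n ** ell - n)
    return prefix + "".join(chi(order, R, arity, ell) for arity, R in relations)

# Universe a < b with a binary relation {(a, b)} and a unary relation {b}.
word = code(["a", "b"], [(2, {("a", "b")}), (1, {("b",)})])
```

Every block has length n^ℓ, so the whole word has length (t + 1) · n^ℓ, and
the position of the bit for a given atom Rā can be computed from the rank of
ā alone.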


Exercise 3.1.11. Prove that this encoding is good. In fact, this encoding lends
itself to a very simple logical description in the following sense: if, besides (or
instead of) the linear ordering <, the corresponding successor relation S and
the constants 0, e for the first and last elements with respect to < are available,
then the encoding is definable by quantifier-free formulae β_σ(x̄).


We can fix any good encoding function and understand ordered structures 
to be represented by their encodings. With an unordered structure 𝔄, we
associate the set of all encodings code(𝔄, <), where < is a linear order on A.
So, when we say that an algorithm M decides a class K of τ-structures, we
actually mean that M decides the set of encodings of structures in K, i.e. the
language


code(K) := {code(𝔄, <) : 𝔄 ∈ K and < is a linear order on A}.


It thus makes sense to ask whether such a K belongs to a complexity class, 
such as P or NP. In particular, we can ask how complicated it is to decide the 
class of models of a logical sentence. 


Word Structures 


We have seen how classes of structures are encoded by languages. On the other
hand, any language $L \subseteq \Gamma^*$ can also be considered as a class of structures over
the vocabulary $\{<\} \cup \{P_a : a \in \Gamma\}$. Indeed, a word $w = w_0 \dots w_{m-1} \in \Gamma^*$ is
described by the structure $B(w)$ with universe $\{0,\dots,m-1\}$, with the usual
interpretation of $<$ and where $P_a = \{i : w_i = a\}$.
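
As a small illustration, the word structure B(w) can be built directly from a word. The function name `word_structure` and the choice of Python sets and dictionaries to represent the relations are assumptions made for this sketch.

```python
def word_structure(word, alphabet):
    """Return (universe, <, {P_a}) for the word structure B(word)."""
    universe = list(range(len(word)))
    # Usual order on positions, stored as a set of pairs.
    less_than = {(i, j) for i in universe for j in universe if i < j}
    # One unary predicate per letter: the positions carrying that letter.
    predicates = {a: {i for i, letter in enumerate(word) if letter == a}
                  for a in alphabet}
    return universe, less_than, predicates
```

For the word "abba" over the alphabet {a, b}, the predicate for a holds at positions 0 and 3, and the predicate for b at positions 1 and 2.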


Isomorphism Invariance 


We have seen that encoding an unordered structure involves selecting an
ordering on the universe. In general, different orderings produce different
encodings. However, we want to consider properties of structures, not of their
encodings. An algorithm that decides whether a structure has a certain
property gets encodings $\mathrm{code}(\mathfrak{A},<)$ as inputs and should produce the same answer
(yes or no) for all encodings of the same structure. That is, the outcome of
the algorithm should not depend on the particular ordered representation of
the structure, but only on its isomorphism type. In other words, the algorithm
should be isomorphism-invariant. For most of the algorithms considered here,
isomorphism invariance is obvious, but in general it is an undecidable
property.


Exercise 3.1.12. A first-order sentence $\psi$ of vocabulary $\tau \cup \{<\}$ is
order-invariant on a class $K$ of $\tau$-structures if its truth on any structure in $K$ does
not depend on the choice of the linear ordering $<$. That is, for any $\mathfrak{A} \in K$
and any pair $<, <'$ of linear orderings on $A$ we have that $(\mathfrak{A},<) \models \psi \iff
(\mathfrak{A},<') \models \psi$. Prove that it is undecidable whether a given first-order formula
is order-invariant on finite structures. Hint: use Trakhtenbrot’s Theorem. A
first-order sentence $\psi$, in which $<$ and $Q$ do not occur, has a finite model
with at least two elements if, and only if, $\psi \rightarrow \forall x \exists y (x < y \vee Qx)$ is not
order-invariant.


3.2 Capturing Complexity Classes 


We have already mentioned that the research programme of descriptive
complexity theory links complexity with logic in a deeper way than a complexity
analysis of model-checking algorithms can do. We are looking for results
saying that, on a certain domain $D$ of structures, a logic $L$ (such as first-order
logic, least fixed-point logic, or a fragment of second-order logic) captures a
complexity class Comp. This means that (1) for every fixed sentence $\psi \in L$,
the data complexity of evaluating $\psi$ on structures from $D$ is a problem in the
complexity class Comp, and (2) every property of structures in $D$ that can be
decided with complexity Comp is definable in the logic $L$.

Two important examples of such results are Fagin’s Theorem, which says 
that existential second-order logic captures NP on the class of all finite struc- 
tures, and the Immerman-Vardi Theorem, which says that least fixed-point 
logic captures PTIME on the class of all ordered finite structures. On ordered 
finite structures, logical characterizations of this kind are known for all major 
complexity classes. On the other hand, it is not known, and it is one of the 
major open problems in the area, whether PTIME can be captured by any 
logic if no ordering is present. 


In Sect. 3.2.1, we prove Fagin’s Theorem and relate it to the spectrum
problem, which is a classical problem in mathematical logic. In Sect. 3.2.2, we 
make precise the notion of a logic capturing a complexity class on a domain 
of finite structures. We then show in Sect. 3.2.3 that on ordered structures, 
second-order Horn logic captures polynomial time. In Sects. 3.2.4 and 3.2.5, 
we discuss logics that capture logarithmic space complexity classes. 


3.2.1 Capturing NP: Fagin’s Theorem 


The spectrum of a first-order sentence $\psi$ is the set of cardinalities of its finite
models, i.e.


$$\mathrm{spectrum}(\psi) := \{k \in \mathbb{N} : \psi \text{ has a model with } k \text{ elements}\}.$$


As early as 1952, Scholz [93] posed the problem of characterizing the class
of spectra, i.e. the subsets $S \subseteq \mathbb{N}$ for which there exists a first-order sentence
$\psi$ such that $\mathrm{spectrum}(\psi) = S$. A more specific problem is the
complementation problem for spectra, posed by Asser [7], who asked whether the
complement of each spectrum is also a spectrum.
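
To make the notion of a spectrum concrete, the following brute-force sketch computes an initial segment of the spectrum of one particular sentence. The sentence is an example chosen here, not one from the text: over a vocabulary with a single unary function symbol f, it states that f(f(x)) = x and f(x) ≠ x for all x. The function name `has_model` is likewise hypothetical.

```python
from itertools import product

def has_model(k):
    """Is there f on {0, ..., k-1} with f(f(x)) = x and f(x) != x for all x?"""
    return any(
        all(f[f[x]] == x and f[x] != x for x in range(k))
        for f in product(range(k), repeat=k)   # all functions, as value tuples
    )

# Cardinalities between 1 and 5 that belong to the spectrum.
spectrum_prefix = [k for k in range(1, 6) if has_model(k)]
```

A fixed-point-free involution pairs up the elements of the universe, so the spectrum of this sentence consists of the positive even numbers. The exhaustive search over all k^k functions is only feasible for very small k.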
Note that the spectrum of a first-order sentence $\psi$ of relational vocabulary
$\tau = \{R_1,\dots,R_m\}$ can be viewed as the set of finite models of the existential
second-order sentence $\exists R_1 \cdots \exists R_m \psi$. Since all relation symbols are quantified,
this is a sentence over the empty vocabulary, i.e. its models are just sets.
Thus there is a one-to-one correspondence between the spectra of first-order
sentences and the classes of finite models of existential second-order sentences
over the empty vocabulary. If we allow different vocabularies for existential
second-order sentences, this naturally leads to the notion of a generalized
spectrum [43].


Definition 3.2.1. Existential second-order logic, sometimes denoted by $\Sigma^1_1$,
is the set of formulae of the form $\exists R_1 \cdots \exists R_m \varphi$, where $m \in \mathbb{N}$, $R_1,\dots,R_m$
are relation symbols of any finite arity, and $\varphi$ is a first-order formula. A
generalized spectrum is the class of finite models of a sentence in existential
second-order logic.


Example 3.2.2. The class of bipartite graphs is a generalized spectrum. It is 
defined by the sentence 


$$\exists R\, \forall x\, \forall y\, (Exy \rightarrow (Rx \leftrightarrow \neg Ry)).$$
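
The meaning of this sentence can be checked naively by trying every candidate for the quantified set R and evaluating the first-order part. The sketch below does exactly that; the function name and the representation of graphs as a node list plus a collection of edge pairs are assumptions. It takes exponential time and is meant only to illustrate the semantics, not as an efficient bipartiteness test.

```python
from itertools import product

def is_bipartite_by_guessing(nodes, edges):
    """Search for a set R with: for every edge (x, y), x in R iff y not in R."""
    nodes = list(nodes)
    for bits in product([False, True], repeat=len(nodes)):
        R = dict(zip(nodes, bits))          # one candidate for the relation R
        if all(R[x] != R[y] for (x, y) in edges):
            return True                     # the first-order part holds for this R
    return False
```

A cycle of length four is accepted, whereas a triangle is rejected because no choice of R separates all three edges.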


WwW 


Exercise 3.2.3. Prove that the class of Hamiltonian graphs, the class of k- 
colourable graphs (for any fixed k), and the class of graphs that admit a perfect 
matching are generalized spectra. (A perfect matching in an undirected graph 
$G = (V, E)$ is a set $M \subseteq E$ of edges such that every node belongs to precisely
one edge of M.) 


Theorem 3.2.4 (Fagin). Let K be an isomorphism-closed class of finite 
structures of some fixed non-empty finite vocabulary. Then K is in NP if 
and only if K is definable by an existential second-order sentence, i.e. if and 
only if K is a generalized spectrum. 


Proof. First, we show how to decide a generalized spectrum. Let $\psi :=
\exists R_1 \cdots \exists R_m \varphi$ be an existential second-order sentence. We shall describe
a non-deterministic polynomial-time algorithm $M$ which, given an
encoding $\mathrm{code}(\mathfrak{A},<)$ of a structure $\mathfrak{A}$, decides whether $\mathfrak{A} \models \psi$. First, $M$
non-deterministically guesses relations $R_1,\dots,R_m$ on $A$. A relation $R_i$ is
determined by a binary string of length $n^{r_i}$, where $r_i$ is the arity of $R_i$ and $n = |A|$.
Then $M$ decides whether $(\mathfrak{A}, R_1,\dots,R_m) \models \varphi$. Since $\varphi$ is first-order, this can
be done in logarithmic space and therefore in polynomial time.

Hence the computation of $M$ consists of guessing a polynomial number of
bits, followed by a deterministic polynomial-time computation. Obviously, $M$
decides the class of finite models of $\psi$.


Conversely, let $K$ be an isomorphism-closed class of $\tau$-structures and let
$M$ be a non-deterministic one-tape Turing machine which, given an input
$\mathrm{code}(\mathfrak{A},<)$, decides in polynomial time whether $\mathfrak{A}$ belongs to $K$. We shall
construct an existential second-order sentence $\varphi$ whose finite models are
precisely the structures in $K$. The construction given here is not quite the
standard one. It is optimized so that it can be easily adapted to other situations,
in particular for giving a capturing result for PTIME (see Section 3.2.3).

Let $M = (Q, \Sigma, q_0, F^+, F^-, \delta)$, where $Q$ is the set of states, $\Sigma$ is the
alphabet of $M$, $q_0$ is the initial state, $F^+$ and $F^-$ are the sets of accepting and
rejecting states, and $\delta : (Q \times \Sigma) \rightarrow \mathcal{P}(Q \times \Sigma \times \{-1,0,1\})$ is the transition
function. Without loss of generality, we can assume that all computations of
$M$ for an input $\mathrm{code}(\mathfrak{A},<)$ reach an accepting or rejecting state after at most
$n^k - 1$ steps (where $n$ is the cardinality of $\mathfrak{A}$).

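We represent a computation of $M$ for an input $\mathrm{code}(\mathfrak{A},<)$ by a tuple $\bar{X}$ of
relations on $A$, and we shall construct a first-order sentence $\psi_M$ of vocabulary
$\tau \cup \{<\} \cup \{\bar{X}\}$ such that


$$(\mathfrak{A},<,\bar{X}) \models \psi_M \iff \text{the relations $\bar{X}$ represent an accepting computation of $M$ on } \mathrm{code}(\mathfrak{A},<).$$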


To represent the $n^k$ time and space parameters of the computation we
identify numbers up to $n^k - 1$ with tuples in $A^k$. Given a linear order, the
associated successor relation and the least and greatest element are of course
definable. Note, further, that if a successor relation $S$ and constants $0, e$ for
the first and last elements are available, then the induced successor relation
$\bar{y} = \bar{x} + 1$ on $k$-tuples is definable by a quantifier-free formula


$$\bigvee_{i<k} \Bigl( \bigwedge_{j<i} (x_j = e \wedge y_j = 0) \wedge S x_i y_i \wedge \bigwedge_{j>i} x_j = y_j \Bigr).$$


Hence, for any fixed integer $m$, the relation $\bar{y} = \bar{x} + m$ is also expressible.
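The description $\bar{X}$ of a computation of $M$ on $\mathrm{code}(\mathfrak{A},<)$ consists of the
following relations.


(1) For each state $q \in Q$, the predicate
$X_q := \{\bar{t} \in A^k : \text{at time } \bar{t}, M \text{ is in state } q\}$.

(2) For each symbol $\sigma \in \Sigma$, the predicate
$Y_\sigma := \{(\bar{t},\bar{a}) \in A^k \times A^k : \text{at time } \bar{t}, \text{ cell } \bar{a} \text{ contains the symbol } \sigma\}$.

(3) The head predicate
$Z := \{(\bar{t},\bar{a}) \in A^k \times A^k : \text{at time } \bar{t}, \text{ the head of } M \text{ is on position } \bar{a}\}$.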

The sentence $\psi_M$ is the universal closure of the conjunction
$\mathrm{START} \wedge \mathrm{COMPUTE} \wedge \mathrm{END}$.


The subformula START enforces the condition that the configuration of
$M$ at time $t = 0$ is $C_0(\mathfrak{A},<)$, the input configuration on $\mathrm{code}(\mathfrak{A},<)$. Recall
that a good encoding is represented by first-order formulae $\beta_\sigma(\bar{x})$ (condition
(iii) of the definition of good encodings). We set

$$\mathrm{START} := X_{q_0}(\bar{0}) \wedge Z(\bar{0},\bar{0}) \wedge \bigwedge_{\sigma \in \Sigma} \bigl(\beta_\sigma(\bar{x}) \rightarrow Y_\sigma(\bar{0},\bar{x})\bigr).$$


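The subformula COMPUTE describes the transitions from one
configuration to the next. It is the conjunction of the formulae


$$\mathrm{NOCHANGE} := \bigwedge_{\sigma \in \Sigma} \bigl( Y_\sigma(\bar{t},\bar{x}) \wedge (\bar{y} \neq \bar{x}) \wedge (\bar{t}' = \bar{t} + 1) \wedge Z(\bar{t},\bar{y}) \rightarrow Y_\sigma(\bar{t}',\bar{x}) \bigr)$$

and

$$\mathrm{CHANGE} := \bigwedge_{q \in Q,\ \sigma \in \Sigma} \Bigl( \mathrm{PRE}[q,\sigma] \rightarrow \bigvee_{(q',\sigma',m) \in \delta(q,\sigma)} \mathrm{POST}[q',\sigma',m] \Bigr)$$

where

$$\mathrm{PRE}[q,\sigma] := X_q(\bar{t}) \wedge Z(\bar{t},\bar{x}) \wedge Y_\sigma(\bar{t},\bar{x}) \wedge \bar{t}' = \bar{t} + 1$$
$$\mathrm{POST}[q',\sigma',m] := X_{q'}(\bar{t}') \wedge Y_{\sigma'}(\bar{t}',\bar{x}) \wedge \exists \bar{y}\,(\bar{x} + m = \bar{y} \wedge Z(\bar{t}',\bar{y})).$$

NOCHANGE expresses the fact that the contents of tape cells that are not
currently being scanned do not change from one configuration to the next,
whereas CHANGE enforces the changes in the relations $X_q$, $Y_\sigma$, and $Z$
imposed by the transition function.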
Finally, we have the formula


$$\mathrm{END} := \bigwedge_{q \in F^-} \neg X_q(\bar{t}),$$

which enforces acceptance by forbidding rejection.
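Claim 1. If $M$ accepts $\mathrm{code}(\mathfrak{A},<)$, then $(\mathfrak{A},<) \models (\exists \bar{X})\psi_M$.

This follows immediately from the construction of $\psi_M$, since for any
accepting computation of $M$ on $\mathrm{code}(\mathfrak{A},<)$ the intended meaning of $\bar{X}$ satisfies
$\psi_M$.

Claim 2. If $(\mathfrak{A},<,\bar{X}) \models \psi_M$, then $M$ accepts $\mathrm{code}(\mathfrak{A},<)$.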


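Suppose that $(\mathfrak{A},<,\bar{X}) \models \psi_M$. For any $M$-configuration $C$ with state $q$,
head position $p$, and tape content $w_0 \cdots w_{n^k-1} \in \Sigma^*$, and for any time $j < n^k$,
let $\mathrm{CONF}[C,j]$ be the conjunction of the atomic statements that hold for $C$
at time $j$, i.e.


$$\mathrm{CONF}[C,j] := X_q(\bar{j}) \wedge Z(\bar{j},\bar{p}) \wedge \bigwedge_{i=0}^{n^k-1} Y_{w_i}(\bar{j},\bar{i})$$


where $\bar{j}$, $\bar{p}$, and $\bar{i}$ are the tuples in $A^k$ representing the numbers $j$, $p$, and $i$.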
(a) Let $C_0$ be the input configuration of $M$ for input $\mathrm{code}(\mathfrak{A},<)$. Since
$(\mathfrak{A},<,\bar{X}) \models \mathrm{START}$, it follows that


$$(\mathfrak{A},<,\bar{X}) \models \mathrm{CONF}[C_0, 0].$$


(b) Owing to the subformula COMPUTE of $\psi_M$, we have, for all non-final
configurations $C$ and all $j < n^k - 1$, that


$$\psi_M \wedge \mathrm{CONF}[C,j] \models \bigvee_{C' \in \mathrm{Next}(C)} \mathrm{CONF}[C', j+1],$$


where $\mathrm{Next}(C) = \{C' : C \vdash_M C'\}$ is the set of successor configurations
of $C$. It follows that there exists a computation


$$C_0(\mathfrak{A},<) = C_0 \vdash_M C_1 \vdash_M \cdots \vdash_M C_{n^k-1} = C_{\mathrm{end}}$$


of $M$ on $\mathrm{code}(\mathfrak{A},<)$ such that, for all $j < n^k$,


$$(\mathfrak{A},<,\bar{X}) \models \mathrm{CONF}[C_j, j].$$


(c) Since $(\mathfrak{A},<,\bar{X}) \models \mathrm{END}$, the configuration $C_{\mathrm{end}}$ is not rejecting. Thus, $M$
accepts $\mathrm{code}(\mathfrak{A},<)$.


This proves Claim 2. Clearly, one can axiomatize linear orders in first-order 
logic. Hence 


$$\mathfrak{A} \in K \quad\text{iff}\quad \mathfrak{A} \models (\exists <)(\exists \bar{X})(\text{“$<$ is a linear order”} \wedge \psi_M).$$


This proves that K is a generalized spectrum. 


Exercise 3.2.5. Prove that every set in NP can be defined by a $\Sigma^1_1$-sentence
whose first-order part has an $\forall^*\exists^*$-prefix. Furthermore, prove that this cannot
be reduced to $\forall^*$. Finally, prove that it can be reduced to $\forall^*$ if


(a) existential second-order quantification over function symbols is allowed, 
or 

(b) if we consider only ordered structures with an explicitly given successor 
relation and constants 0, e for the first and last elements. 


There are several interesting consequences of Fagin’s Theorem. First of 
all, the NP-completeness of SAT (the satisfiability problem for propositional 
logic) is an easy corollary of Fagin’s Theorem. 


Theorem 3.2.6 (Cook and Levin). SAT is NP-complete. 


Proof. It is obvious that SAT is an NP-problem. It remains to show that any
problem $K$ in NP can be reduced to SAT. Since, as explained above, words can
be viewed as special kinds of finite structures, we can assume that $K \subseteq \mathrm{Fin}(\tau)$
for some finite vocabulary $\tau$. By Fagin’s Theorem, there exists a first-order
sentence $\psi$ such that

$$K = \{\mathfrak{A} \in \mathrm{Fin}(\tau) : \mathfrak{A} \models \exists R_1 \cdots \exists R_m \psi\}.$$


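We now present a logspace reduction that associates with every input
structure $\mathfrak{A} \in \mathrm{Fin}(\tau)$ a propositional formula $\psi_{\mathfrak{A}}$. Given $\mathfrak{A}$, replace in $\psi$


• all subformulae $\exists x_i \varphi$ by $\bigvee_{a_i \in A} \varphi[x_i/a_i]$,
• all subformulae $\forall x_i \varphi$ by $\bigwedge_{a_i \in A} \varphi[x_i/a_i]$, and
• all $\tau$-atoms $P\bar{u}$ by their truth values in $\mathfrak{A}$.


Since the $\tau$-atoms can be evaluated efficiently, this translation is
computable efficiently. Viewing the atoms $R_i\bar{u}$ as propositional variables, we have
obtained a propositional formula $\psi_{\mathfrak{A}}$ such that


$$\mathfrak{A} \in K \iff \mathfrak{A} \models \exists R_1 \cdots \exists R_m \psi \iff \psi_{\mathfrak{A}} \in \mathrm{SAT}.$$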
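
The grounding step of this reduction can be illustrated on the bipartiteness sentence of Example 3.2.2. In the sketch below, the quantifiers over x and y are expanded over the universe, each atom Eab is replaced by its truth value, and the atoms Ra become propositional variables. The clause format (lists of (variable, sign) pairs) and both function names are assumptions, and the brute-force satisfiability check only stands in for an actual SAT solver.

```python
from itertools import product

def ground_bipartite(universe, edges):
    """Ground 'forall x, y: Exy -> (Rx <-> not Ry)' into CNF clauses over the R_a."""
    clauses = []
    for a, b in product(universe, repeat=2):
        if (a, b) in edges:                            # atom Eab evaluated in the input
            clauses.append([(a, True), (b, True)])     # R_a or R_b
            clauses.append([(a, False), (b, False)])   # not R_a or not R_b
        # If Eab is false, the conjunct is true and contributes no clause.
    return clauses

def satisfiable(universe, clauses):
    """Naive satisfiability test by enumerating all assignments."""
    universe = list(universe)
    for bits in product([False, True], repeat=len(universe)):
        val = dict(zip(universe, bits))
        if all(any(val[v] == sign for v, sign in clause) for clause in clauses):
            return True
    return False
```

The number of clauses is at most 2n² for a universe of size n, in line with the polynomial bound on the size of the grounded formula.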


Fagin’s Theorem is readily extended to the higher levels of the
polynomial-time hierarchy, and thus to a correspondence between second-order logic and
the polynomial-time hierarchy.


Corollary 3.2.7. Let $K$ be an isomorphism-closed class of finite structures of
some fixed non-empty vocabulary $\tau$. Then $\mathrm{code}(K)$ is in the polynomial-time
hierarchy PH if and only if there exists a second-order sentence $\psi$ such that
$K$ is the class of finite models of $\psi$.


In the statement of Fagin’s Theorem, we required the vocabulary to be non- 
empty. The case of the empty vocabulary, i.e. spectra, is different, because the 
natural way of specifying a finite set is to write down its size n in binary, and so 
the length of the encoding is logarithmic in n, whereas encodings of structures 
of non-empty vocabularies have polynomial length. The formula constructed 
in the proof of Fagin’s Theorem talks about computations that are polynomial 
in n, and hence, in the case of spectra, exponential in the length of the input. 
As a consequence, Fagin’s characterization of generalized spectra in terms of 
NP implies a characterization of spectra in terms of NEXPTIME. This has 
also been established in a different way in [71]. 


Corollary 3.2.8 (Jones and Selman). A set $S \subseteq \mathbb{N}$ is a spectrum if and
only if $S \in$ NEXPTIME.


Hence the complementation problem for spectra is really a complexity- 
theoretic problem: spectra are closed under complementation if, and only if, 
NEXPTIME = Co-NEXPTIME. 


Exercise 3.2.9. Prove that a set $S \subseteq \mathbb{N}$ is in EXPTIME if and only if it is a
categorical spectrum, i.e. the spectrum of a first-order sentence that has, up 
to isomorphism, at most one model in any finite cardinality. 
Fagin’s Theorem gives a precise correspondence between a logic and a com-
plexity class: a property of finite structures is decidable in non-deterministic 
polynomial time exactly when it is definable in existential second-order logic. 
The same is true for the correspondence between the polynomial-time hierar- 
chy and SO, as given by Corollary 3.2.7. 

Note that the results on the model-checking complexity of first-order logic 
do not give such precise correspondences. We know by Theorem 3.1.5 and 
Corollary 3.1.8 that whenever a property of finite structures is first-order 
definable, it is decidable in LOGSPACE and in fact even in ALOGTIME. 
But we do not have a result giving the converse, and in fact the converse is 
false. There are computationally very simple properties of finite structures 
that are not first-order definable; one of them is the property of having an 
even number of elements. 

Hence the natural question arises of whether complexity classes other than 
NP and the polynomial-time hierarchy can also be precisely captured by log- 
ics. For most of the popular complexity classes, notably PTIME, we do not 
know whether this is possible on the domain of all finite structures. But we 
have a lot of interesting capturing results if we do not consider arbitrary 
finite structures, but certain specific domains. In particular we have close cor- 
respondences between logic and complexity for the domain of ordered finite 
structures. 

By a model class we always mean a class $K$ of structures of a fixed
vocabulary $\tau$ that is closed under isomorphism, i.e. if $\mathfrak{A} \in K$ and $\mathfrak{A} \cong \mathfrak{B}$, then
also $\mathfrak{B} \in K$. We speak of a domain of structures instead, if the vocabulary
is not fixed. For a domain $D$ and vocabulary $\tau$, we write $D(\tau)$ for the class of
$\tau$-structures in $D$.

Intuitively, a logic L captures a complexity class Comp on D if the L- 
definable properties of structures in D are precisely those that are decidable 
in Comp. Here is a more detailed definition. 


Definition 3.2.10. Let $L$ be a logic, Comp a complexity class, and $D$ a
domain of finite structures. We say that $L$ captures Comp on $D$ if


(1) For every vocabulary $\tau$ and every sentence $\psi \in L(\tau)$, the model-checking
problem for $\psi$ on $D(\tau)$ is in the complexity class Comp.

(2) For every model class $K \subseteq D(\tau)$ whose membership problem is in Comp,
there exists a sentence $\psi \in L(\tau)$ such that


$$K = \{\mathfrak{A} \in D(\tau) : \mathfrak{A} \models \psi\}.$$


By Fagin’s Theorem, the logic $\Sigma^1_1$ captures NP on the domain of all finite
structures, and by Corollary 3.2.7, second-order logic captures the polynomial- 
time hierarchy. 

We sometimes simply write $L \subseteq \mathrm{Comp}$ to say that condition (1) of
Definition 3.2.10 is satisfied for $L$ and Comp on the domain of all finite structures.
A classical result, from the ‘prehistory’ of finite model theory, says that
a language is regular (i.e. recognizable by a finite automaton) if, and only if, 
it is definable in monadic second-order logic (MSO). As words can be viewed 
as a special domain of structures, this is a capturing result in the sense of 
Definition 3.2.10. 


Theorem 3.2.11 (Büchi, Elgot, and Trakhtenbrot). On the domain of
word structures, monadic second-order logic captures the regular languages. 


There are numerous extensions and ramifications of this theorem, most of 
them established in the context of automata theory. We refer to [95, 97] for a 
proof and further results. However, the emphasis of most of the work in finite 
model theory is on structures more complicated than words, and
concerns complexity levels higher than the regular languages. 


3.2.3 Capturing Polynomial Time on Ordered Structures 


In this section, we present a logical characterization of polynomial time on 
ordered structures, in terms of second-order Horn logic. Other such charac- 
terizations will follow in subsequent sections. 


Definition 3.2.12. Second-order Horn logic, denoted by SO-HORN, is
the set of second-order sentences of the form


$$Q_1 R_1 \cdots Q_m R_m\, \forall y_1 \cdots \forall y_s \bigwedge_{i=1}^{t} C_i$$


where $Q_i \in \{\exists, \forall\}$, the $R_i$ are relation symbols, and the $C_i$ are Horn clauses
with respect to $R_1,\dots,R_m$. More precisely, each $C_i$ is an implication of the
form

$$H \leftarrow \beta_1 \wedge \cdots \wedge \beta_m$$


where each $\beta_j$ is either a positive atom $R_k\bar{z}$, or a first-order formula that does
not contain $R_1,\dots,R_m$. The conjunction $\beta_1 \wedge \cdots \wedge \beta_m$ is called the body of
the clause. $H$, the head of the clause, is either an atom $R_j\bar{z}$ or the Boolean
constant 0 (for false).


Thus the first-order parts of the sentences in SO-HORN are universal Horn
sentences with respect to the quantified predicates $R_1,\dots,R_m$, but may use
arbitrary first-order information about the ‘input predicates’ from the
underlying vocabulary. $\Sigma^1_1$-HORN denotes the existential fragment of SO-HORN,
i.e. the set of SO-HORN sentences where all second-order quantifiers are
existential.


Example 3.2.13. The problem GEN is a well-known P-complete problem [57,
70]. It may be presented as the set of structures $(A, S, f, a)$ in the vocabulary
of one unary predicate $S$, one binary function $f$, and a constant $a$, such that
$a$ is contained in the closure of $S$ under $f$. Clearly, the complement of GEN
is also P-complete. It is defined by the following sentence of $\Sigma^1_1$-HORN:


$$\exists R\, \forall y\, \forall z\, \bigl( (Ry \leftarrow Sy) \wedge (Rfyz \leftarrow Ry \wedge Rz) \wedge (0 \leftarrow Ra) \bigr).$$
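
The Horn sentence above holds exactly when the least set R that contains S and is closed under f does not contain a, since every witness for R must include this least set. A minimal fixed-point computation makes this concrete; the function names and the representation of f as a Python function are assumptions of the sketch.

```python
def closure(S, f):
    """Least set containing S and closed under the binary function f."""
    R = set(S)
    changed = True
    while changed:
        changed = False
        for y in list(R):
            for z in list(R):
                w = f(y, z)
                if w not in R:          # clause  R f(y,z) <- R y and R z
                    R.add(w)
                    changed = True
    return R

def in_gen(S, f, a):
    """(A, S, f, a) is in GEN iff a lies in the closure of S under f."""
    return a in closure(S, f)
```

For instance, with addition modulo 6 and S = {2}, the closure is {0, 2, 4}, so a = 4 gives an instance of GEN and a = 3 an instance of its complement.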


Example 3.2.14. The circuit value problem (CVP) is also P-complete [57],
even when restricted to circuits with a fan-in of 2 over NAND gates. Such a
circuit can be considered as a structure $(V, E, I^+, I^-, \mathrm{out})$, where $(V, E)$ is a
directed acyclic graph, $I^+$ and $I^-$ are monadic predicates, and out is a constant.
Here $Exy$ means that node $x$ is one of the two input nodes for $y$; $I^+$ and $I^-$
contain the input nodes with values 1 and 0, respectively; and out stands for
the output node.

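We shall take for granted that $E$ is a connected, acyclic graph with a fan-in
of 2, sources $I^+ \cup I^-$, and sink out. The formula $\exists T\, \exists F\, \forall x\, \forall y\, \forall z\, \varphi$, where $\varphi$ is
the conjunction of the clauses


$$Tx \leftarrow I^+ x$$
$$Fx \leftarrow I^- x$$
$$Ty \leftarrow Fx \wedge Exy$$
$$Fz \leftarrow Tx \wedge Exz \wedge Ty \wedge Eyz \wedge x \neq y$$
$$0 \leftarrow Tx \wedge Fx$$
$$Tx \leftarrow x = \mathrm{out}$$

then states that the circuit $(V, E, I^+, I^-, \mathrm{out})$ evaluates to 1.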
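
For comparison with the Horn clauses, here is a direct evaluator for such NAND circuits. It is a sketch under the same assumptions as the example (acyclic, every gate has exactly two distinct input nodes); the function name and the argument format are hypothetical.

```python
def evaluate_nand_circuit(inputs_true, inputs_false, edges, out):
    """Value of node `out` in a fan-in-2 NAND circuit given as an edge set."""
    value = {v: True for v in inputs_true}           # nodes in I+
    value.update({v: False for v in inputs_false})   # nodes in I-
    preds = {}
    for x, y in edges:                               # Exy: x is an input of y
        preds.setdefault(y, []).append(x)

    def val(v):
        if v not in value:
            a, b = preds[v]                          # exactly two inputs assumed
            value[v] = not (val(a) and val(b))       # NAND gate
        return value[v]

    return val(out)
```

A gate is true as soon as one of its inputs is false (third clause) and false when both inputs are true (fourth clause), which is what the recursion computes.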


Exercise 3.2.15. To justify the definition of SO-HORN, show that the
admission of quantifiers over functions, or of first-order prefixes of a more general
form, would make the restriction to Horn clauses pointless. Any such extension
of SO-HORN has the full power of second-order logic.


Theorem 3.2.16. Every sentence $\psi \in$ SO-HORN is equivalent to some
sentence $\psi' \in \Sigma^1_1$-HORN.


Proof. It suffices to prove the theorem for formulae of the form


$$\psi := \forall P\, \exists R_1 \cdots \exists R_m\, \forall \bar{z}\, \varphi,$$


where $\varphi$ is a conjunction of Horn clauses. An arbitrary formula in SO-HORN
may then be brought to existential form by successively removing the
innermost universal second-order quantifier. We first prove the following claim.


Claim. A formula $\exists \bar{R}\, \forall \bar{z}\, \varphi(P, \bar{R}) \in \Sigma^1_1$-HORN is true for all predicates $P$ (on
a given structure $\mathfrak{A}$) if it holds for those predicates $P$ that are false at at most
one point.


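Let $k$ be the arity of $P$. For every $k$-tuple $\bar{a}$, let $P^{\bar{a}} = A^k - \{\bar{a}\}$, i.e. the
predicate that is false at $\bar{a}$ and true at all other points. By assumption, there
exist predicates $\bar{R}^{\bar{a}}$ such that


$$(\mathfrak{A}, P^{\bar{a}}, \bar{R}^{\bar{a}}) \models \forall \bar{z}\, \varphi.$$


Now, take any predicate $P \neq A^k$, and let $R_i := \bigcap_{\bar{a} \notin P} R_i^{\bar{a}}$. We claim that
$(\mathfrak{A}, P, \bar{R}) \models \forall \bar{z}\, \varphi$.

Suppose that this is false; there then exists a relation $P \neq A^k$, a clause $C$
of $\varphi$, and an assignment $\rho : \{z_1,\dots,z_s\} \rightarrow A$ such that $(\mathfrak{A}, P, \bar{R}) \models \neg C[\rho]$. We
now show that there then exists a tuple $\bar{a}$ such that also $(\mathfrak{A}, P^{\bar{a}}, \bar{R}^{\bar{a}}) \models \neg C[\rho]$.

If the head of $C[\rho]$ is $P\bar{u}$, then take $\bar{a} = \bar{u} \notin P$. If the head of $C[\rho]$ is
$R_i\bar{u}$, then choose some $\bar{a} \notin P$ such that $\bar{u} \notin R_i^{\bar{a}}$; such an $\bar{a}$ must exist because
$\bar{u} \notin R_i$. Finally, if the head is 0, take an arbitrary $\bar{a} \notin P$. The head of $C[\rho]$
is clearly false in $(\mathfrak{A}, P^{\bar{a}}, \bar{R}^{\bar{a}})$. The atom $P\bar{a}$ does not occur in the body of
$C[\rho]$, because $\bar{a} \notin P$ and all atoms in the body of $C[\rho]$ are true in $(\mathfrak{A}, P, \bar{R})$;
all other atoms of the form $P\bar{v}$ that might occur in the body of the clause
remain true for $P^{\bar{a}}$ also. Moreover, every atom $R_i\bar{v}$ in the body remains true
if $R_i$ is replaced by $R_i^{\bar{a}}$ (because $R_i \subseteq R_i^{\bar{a}}$). This implies that
$(\mathfrak{A}, P^{\bar{a}}, \bar{R}^{\bar{a}}) \models \neg C[\rho]$, and thus


$$(\mathfrak{A}, P^{\bar{a}}, \bar{R}^{\bar{a}}) \models \neg \forall \bar{z}\, \varphi,$$


which contradicts our assumption.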


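Thus the claim has been established. This implies that the original formula
$\psi$ is equivalent to the conjunction


$$\exists \bar{R}\, \forall \bar{z}\, \varphi_0 \wedge \forall \bar{y}\, (\exists \bar{R})\, \forall \bar{z}\, \varphi_1,$$


where $\varphi_1$ and $\varphi_0$ are obtained from $\varphi$ by replacing every atom $P\bar{u}$ by $\bar{u} \neq \bar{y}$
(which is true iff $\bar{u} \in P^{\bar{y}}$), or by $(\bar{u} = \bar{u})$ (which is always true), respectively. It
is easy to transform this conjunction into an equivalent formula in $\Sigma^1_1$-HORN.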


Theorem 3.2.17. If $\psi \in$ SO-HORN, then the set of finite models of $\psi$ is in
PTIME.


Proof. We can restrict our attention to sentences $\psi = \exists R_1 \cdots \exists R_m\, \forall \bar{z} \bigwedge_i C_i$ in
$\Sigma^1_1$-HORN. Given any finite structure $\mathfrak{A}$ of appropriate vocabulary, we reduce
the problem of whether $\mathfrak{A} \models \psi$ to the satisfiability problem for a propositional
Horn formula by the same technique as in the proof of Theorem 3.2.6.
Replace the universal quantifiers $\forall z_i$ by conjunctions over the elements
$a_i \in A$ and omit the quantifier prefix. Then substitute in the body of each
clause the first-order formulae that do not involve $R_1,\dots,R_m$ by their truth
values in $\mathfrak{A}$. If there is any clause that is already made false by this partial
interpretation (i.e. the head is false and all atoms in the body are true),
then reject $\psi$. Otherwise, omit all clauses that are already made true (i.e.
the head is true or a conjunct of the body is false) and delete the conjuncts
already interpreted from the remaining clauses. Consider the atoms $R_i\bar{u}$ as
propositional variables. The resulting formula is a propositional Horn formula
whose length is polynomially bounded in the cardinality of $\mathfrak{A}$ and which is
satisfiable if and only if $\mathfrak{A} \models \psi$. The satisfiability problem for propositional
Horn formulae can be solved in linear time.
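
The last step of this proof, satisfiability of a propositional Horn formula, can be sketched by forward chaining: compute the least set of variables forced to be true and check that no clause with head 0 fires. The clause format, pairs (head, body) with `None` standing for the head 0, is an assumption of this example. The simple loop below runs in quadratic time; the linear-time bound quoted above requires a more careful implementation that keeps, for each clause, a counter of body atoms not yet derived.

```python
def horn_satisfiable(clauses):
    """Decide satisfiability of Horn clauses given as (head, body) pairs.

    `head` is a variable name or None (for the constant 0);
    `body` is a set of variable names, read as a conjunction.
    """
    true_vars = set()                   # least model, built up monotonically
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if body <= true_vars:       # all body atoms are already forced
                if head is None:
                    return False        # a clause 0 <- body is violated
                if head not in true_vars:
                    true_vars.add(head)
                    changed = True
    return True
```

If the procedure returns True, the set of derived variables is itself a satisfying assignment, namely the least one.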


Theorem 3.2.18 (Grädel). On ordered structures, SO-HORN and
$\Sigma^1_1$-HORN capture PTIME.


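Proof. This follows from an analysis of our proof of Fagin’s Theorem. If the
Turing machine $M$ happens to be deterministic, then the sentence $\exists \bar{X} \psi_M$
constructed in that proof can easily be transformed to an equivalent sentence
in $\Sigma^1_1$-HORN.

To see this, recall that $\psi_M$ is the universal closure of $\mathrm{START} \wedge
\mathrm{NOCHANGE} \wedge \mathrm{CHANGE} \wedge \mathrm{END}$. The formulae START, NOCHANGE, and
END are already in Horn form. The formula CHANGE has the form


$$\bigwedge_{q \in Q,\ \sigma \in \Sigma} \Bigl( \mathrm{PRE}[q,\sigma] \rightarrow \bigvee_{(q',\sigma',m) \in \delta(q,\sigma)} \mathrm{POST}[q',\sigma',m] \Bigr),$$


where


$$\mathrm{PRE}[q,\sigma] := X_q(\bar{t}) \wedge Z(\bar{t},\bar{x}) \wedge Y_\sigma(\bar{t},\bar{x}) \wedge \bar{t}' = \bar{t} + 1$$
$$\mathrm{POST}[q',\sigma',m] := X_{q'}(\bar{t}') \wedge Y_{\sigma'}(\bar{t}',\bar{x}) \wedge \exists \bar{y}\,(\bar{x} + m = \bar{y} \wedge Z(\bar{t}',\bar{y})).$$


For a deterministic $M$, we have for each pair $(q,\sigma)$ a unique value $\delta(q,\sigma) =
(q',\sigma',m)$. In this case, the implication $\mathrm{PRE}[q,\sigma] \rightarrow \mathrm{POST}[q',\sigma',m]$ can be
replaced by the conjunction of the Horn clauses

$$\mathrm{PRE}[q,\sigma] \rightarrow X_{q'}(\bar{t}')$$
$$\mathrm{PRE}[q,\sigma] \rightarrow Y_{\sigma'}(\bar{t}',\bar{x})$$
$$\mathrm{PRE}[q,\sigma] \wedge \bar{y} = \bar{x} + m \rightarrow Z(\bar{t}',\bar{y}).$$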


Exercise 3.2.19. Prove that, contrary to the case of Fagin’s Theorem, the 
assumption that a linear order is explicitly available cannot be eliminated, 
since linear orderings are not axiomatizable by Horn formulae. 


Exercise 3.2.20. In [47], where the results of this section were proved, a weaker 
variant of SO-HORN was used, in which the body may not contain arbitrary 
first-order formulae of the input vocabulary, but only atoms and negated input 
atoms. Prove that the two variants of SO-HORN are equivalent on ordered 
structures with a successor relation and with constants for the first and last 
elements, but not on ordered structures without a successor relation. Hint: 
sentences in the weak variant of SO-HORN are preserved under substructures, 
i.e. if $\mathfrak{A} \models \psi$ and $\mathfrak{B} \subseteq \mathfrak{A}$, then also $\mathfrak{B} \models \psi$.
3.2.4 Capturing Logarithmic Space Complexity


In this section and the next, we describe two approaches to defining logics 
that capture logarithmic space complexity classes on ordered structures. The 
first approach is based on restrictions of second-order logic, similarly to the 
definition of SO-HORN, whereas the second technique adds transitive closure 
operators to first-order logic. 


Definition 3.2.21. Second-order Krom logic, denoted by SO-KROM, is
the set of second-order formulae


$$Q_1 R_1 \cdots Q_m R_m\, \forall y_1 \cdots \forall y_s \bigwedge_{i=1}^{t} C_i$$


where every clause $C_i$ is a disjunction of at most two literals of the form $(\neg)R_i\bar{y}$
and of a first-order formula that does not contain $R_1,\dots,R_m$. Such formulae
are Krom (i.e. in 2-CNF) with respect to the quantified predicates. $\Sigma^1_1$-KROM
is the existential fragment of SO-KROM. The intersection of $\Sigma^1_1$-HORN and
$\Sigma^1_1$-KROM is denoted by $\Sigma^1_1$-KROM-HORN.


Example 3.2.22. The reachability problem (‘Is there a path in the graph $(V, E)$
from $a$ to $b$?’) is complete for NLOGSPACE via first-order translations. Its
complement is expressible by a formula from $\Sigma^1_1$-KROM-HORN,


$$\exists T\, \forall x\, \forall y\, \forall z\, \bigl( Txx \wedge (Txz \leftarrow Txy \wedge Eyz) \wedge (0 \leftarrow Tab) \bigr).$$
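
Any relation T satisfying the first two clauses contains the reflexive-transitive closure of E, so the sentence holds exactly when (a, b) is not in the least such T. The following sketch computes that least relation by iteration; the function names and the set-of-pairs representation are assumptions, and the procedure is a polynomial-time illustration rather than a logarithmic-space algorithm.

```python
def least_T(nodes, edges):
    """Least T with Txx for all x, closed under: Txy and Eyz imply Txz."""
    T = {(x, x) for x in nodes}
    changed = True
    while changed:
        changed = False
        for (x, y) in list(T):
            for (u, z) in edges:
                if u == y and (x, z) not in T:
                    T.add((x, z))
                    changed = True
    return T

def unreachable(nodes, edges, a, b):
    """True iff the Krom-Horn sentence holds, i.e. there is no path from a to b."""
    return (a, b) not in least_T(nodes, edges)
```

On the graph with edges 0 → 1 → 2 and an isolated node 3, node 2 is reachable from 0, while 0 is unreachable from 2 and 3 is unreachable from 0.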


As in the case of SO-HORN, it is also known that every sentence of 
SO-KROM is equivalent to a sentence of $\Sigma^1_1$-KROM (see [47]).


Proposition 3.2.23. For every sentence $\psi \in$ SO-KROM, the set of finite
models of $\psi$ is in NLOGSPACE.


The proof is analogous to the proof of Theorem 3.2.17. It uses the fact
that 2-SAT, the satisfiability problem for propositional Krom formulae, is in
NLOGSPACE. On ordered structures, SO-KROM captures NLOGSPACE.
We shall indicate the general idea of the proof here. Suppose that $M$ is an
$O(\log n)$-space-bounded non-deterministic Turing machine with an input tape
carrying a representation $\mathrm{code}(\mathfrak{A},<)$ of an input structure, and one or more
separate work tapes. A reduced configuration of $M$ reflects the control state
of $M$, the content of the work tapes, and the positions of the heads on the
input tape and the work tapes. Thus a configuration is specified by a reduced
configuration together with the input. Given that reduced configurations of $M$
for the input $\mathrm{code}(\mathfrak{A},<)$ have a logarithmic length with respect to $|A|$, we can
represent them by tuples $\bar{c} = c_1,\dots,c_r \in A^r$ for fixed $r$. The initial reduced
configuration on any input $\mathrm{code}(\mathfrak{A},<)$ is represented by the tuple $\bar{0}$. Assume
that $M$ has a single accepting state, say state 1, and let the first component of
the reduced configuration describe the state. The condition that $\bar{y}$ represents
an accepting configuration is then expressed by $\mathrm{ACCEPT}(\bar{y}) := (y_1 = 1)$.
Further, it is not difficult (although it is somewhat lengthy) to write down
a quantifier-free formula $\mathrm{NEXT}(\bar{x},\bar{y})$ such that, for every successor structure
$(\mathfrak{A}, S, 0, e)$ and every tuple $\bar{c}$ representing a reduced configuration,


$$(\mathfrak{A}, S, 0, e) \models \mathrm{NEXT}(\bar{c},\bar{d})$$


if, and only if, $\bar{d}$ represents a reduced successor configuration of $\bar{c}$ for the input
$(\mathfrak{A},<)$. Taking the disjunctive normal form $\mathrm{NEXT}(\bar{x},\bar{y}) = \bigvee_i \mathrm{NEXT}_i(\bar{x},\bar{y})$,
we can express the statement that $M$ does not accept the input $\mathrm{code}(\mathfrak{A},<)$ by
the sentence


$$\psi_M := \exists R\, \forall \bar{x}\, \forall \bar{y}\, \Bigl( R\bar{0} \wedge \bigwedge_i \bigl( R\bar{y} \leftarrow R\bar{x} \wedge \mathrm{NEXT}_i(\bar{x},\bar{y}) \bigr) \wedge \bigl( 0 \leftarrow R\bar{y} \wedge \mathrm{ACCEPT}(\bar{y}) \bigr) \Bigr).$$


This proves that, on ordered structures, the complement of every problem
in NLOGSPACE is definable in SO-KROM. Since NLOGSPACE is closed
under complements, and since the formula $\psi_M$ is in fact in $\Sigma^1_1$-KROM-HORN,
we have proved the following result.


Theorem 3.2.24 (Grädel). On ordered structures, the logics SO-KROM,
$\Sigma^1_1$-KROM, and $\Sigma^1_1$-KROM-HORN capture NLOGSPACE.


Remark. The characterizations of P and NLOGSPACE by second-order Horn
and Krom logics can also be reformulated in terms of generalized spectra. 
The notion of a generalized spectrum can be appropriately modified to the 
notions of a generalized Horn spectrum and a generalized Krom spectrum. 
Let a model class be any isomorphism-closed class of structures of some fixed 
finite signature. Fagin’s Theorem and Theorems 3.2.18 and 3.2.24 can then 
be summarized as follows: 


• A model class of finite structures is in NP iff it is a generalized spectrum.

• A model class of ordered structures is in P iff it is a generalized Horn
spectrum.

• A model class of ordered structures is in NLOGSPACE iff it is a generalized
Krom spectrum.


3.2.5 Transitive Closure Logics 


One of the limitations of first-order logic is the lack of a mechanism for un- 
bounded iteration or recursion. This has motivated the study of more power- 
ful languages that add recursion in one way or another to first-order logic. A 
simple but important example of a query that is not first-order expressible is 
reachability. By adding transitive closure operators to FO, we obtain a natural 
family of logics with a recursion mechanism.


Definition 3.2.25. Transitive closure logic, denoted by TC, is obtained
by augmenting the syntax of first-order logic by the following rule for building
formulae:

Let $\varphi(\bar x, \bar y)$ be a formula with variables $\bar x = x_1, \ldots, x_k$ and $\bar y = y_1, \ldots, y_k$,
and let $\bar u$ and $\bar v$ be two $k$-tuples of terms. Then

$$[\mathrm{tc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar u, \bar v)$$

is a formula which says that the pair $(\bar u, \bar v)$ is contained in the transitive
closure of the binary relation on $k$-tuples that is defined by $\varphi$. In other words,
$\mathfrak{A} \models [\mathrm{tc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar a, \bar b)$ if, and only if, there exist an $n \geq 1$ and tuples
$\bar c_0, \ldots, \bar c_n$ in $A^k$ such that $\bar c_0 = \bar a$, $\bar c_n = \bar b$, and $\mathfrak{A} \models \varphi(\bar c_i, \bar c_{i+1})$ for all $i < n$.


Of course, it is understood that $\varphi$ can contain free variables other than
$\bar x$ and $\bar y$; these will also be free in the new formula. Moreover, transitive
closure logic is closed under the usual first-order operations. We can thus build
Boolean combinations of TC-formulae, we can nest TC-operators, etc.


Example 3.2.26. A directed graph $G = (V, E)$ is acyclic if, and only if,
$G \models \forall z\, \neg[\mathrm{tc}_{x,y}\ Exy](z, z)$. It is well known that a graph is bipartite (2-colourable) if,
and only if, it does not contain a cycle of odd length. This is expressed by the
TC-formula $\forall x \forall y([\mathrm{tc}_{x,y}\ x \neq y \wedge \exists z(Exz \wedge Ezy)](x, y) \rightarrow \neg Eyx)$.
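
The two TC-formulae of this example can be checked by brute force on small finite graphs. The following is only a sketch under stated assumptions: the function names are hypothetical, the graph is given as a set of ordered pairs (an undirected graph as a symmetric edge set), and the bipartiteness test follows the reading of the formula as "x differs from y and there is a path of length 2 from x to y".

```python
def transitive_closure(pairs):
    """Transitive closure (paths of length >= 1) of a binary relation."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_acyclic(vertices, edges):
    """forall z: not [tc_{x,y} Exy](z, z)."""
    tc = transitive_closure(edges)
    return all((z, z) not in tc for z in vertices)


def has_no_odd_cycle(vertices, edges):
    """forall x, y: [tc_{x,y} x != y and exists z (Exz and Ezy)](x, y) -> not Eyx."""
    # The relation being closed: distinct endpoints joined by a path of length 2.
    two_step = {
        (x, y)
        for x in vertices
        for y in vertices
        if x != y and any((x, z) in edges and (z, y) in edges for z in vertices)
    }
    tc = transitive_closure(two_step)
    return all((y, x) not in edges for (x, y) in tc)
```

The closure loop is deliberately naive; it only illustrates the semantics of the tc operator and is not meant as an efficient reachability algorithm.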


Exercise 3.2.27. Show that, for every $\psi \in$ TC, the set of finite models of $\psi$
is decidable in NLOGSPACE.


The same idea as in the proof of Theorem 3.2.24 shows that, on ordered
structures, TC captures NLOGSPACE. The condition that an $O(\log n)$-space-bounded
Turing machine $M$ accepts $\mathrm{code}(\mathfrak{A}, <)$ is expressed by the formula

$$\exists \bar z(\mathrm{ACCEPT}(\bar z) \wedge [\mathrm{tc}_{\bar x, \bar y}\ \mathrm{NEXT}(\bar x, \bar y)](\bar 0, \bar z)).$$


Theorem 3.2.28 (Immerman). On ordered structures, TC captures 
NLOGSPACE. 


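An interesting variant of TC is deterministic transitive closure logic,
denoted DTC, which makes definable the transitive closure of any deterministic
definable relation. The syntax of DTC is analogous to TC, allowing us
to build formulae of the form $[\mathrm{dtc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar u, \bar v)$, for any formula $\varphi(\bar x, \bar y)$.
The semantics can be defined by the equivalence

$$[\mathrm{dtc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar u, \bar v) \equiv [\mathrm{tc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y) \wedge \forall \bar z(\varphi(\bar x, \bar z) \rightarrow \bar y = \bar z)](\bar u, \bar v).$$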
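
The equivalence can be read operationally: keep only those pairs whose source has exactly one successor under the defined relation, then follow the resulting unique-successor paths. The sketch below assumes the relation is given as a finite set of pairs over a finite universe (for $k$-tuples, the elements of the universe would themselves be tuples); the function name is hypothetical.

```python
def deterministic_closure(universe, phi):
    """Pairs (u, v) satisfying [dtc phi](u, v) for a finite relation phi."""
    # Deterministic part: x keeps its edge only if its phi-successor is unique.
    succ = {}
    for x in universe:
        targets = [y for y in universe if (x, y) in phi]
        if len(targets) == 1:
            succ[x] = targets[0]

    result = set()
    for start in universe:
        seen = set()
        cur = start
        # Follow the unique path from start until it ends or starts repeating.
        while cur in succ and succ[cur] not in seen:
            cur = succ[cur]
            seen.add(cur)
            result.add((start, cur))
    return result
```

On the relation {(0,1), (1,2), (1,3), (2,3)} this yields only (0,1) and (2,3), because node 1 has two successors and is therefore dropped, whereas the ordinary transitive closure would also contain (0,3).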


It is clear that transitive closures of deterministic relations can be checked
by deterministic Turing machines using only logarithmic space. Conversely,
acceptance by such machines amounts to deciding a reachability problem ('is
there an accepting configuration that is reachable from the input configuration?')
with respect to the successor relation $\vdash_M$ on configurations. Of course,
for deterministic Turing machines, $\vdash_M$ is deterministic. We already know that
on ordered structures, $\vdash_M$ is first-order definable, and hence acceptance can
be defined in DTC.


Theorem 3.2.29 (Immerman). On ordered finite structures, DTC captures
LOGSPACE.


In particular, separating DTC from TC on ordered finite structures would 
amount to separating the complexity classes LOGSPACE and NLOGSPACE. 
However, on the domain of arbitrary finite structures, we can actually separate 
these logics [51]. 

Given a graph $G = (V, E)$, let $2G$ be the graph with vertex set $V \times \{0, 1\}$
and edges $((u, i), (v, j))$ for $(u, v) \in E$, $i, j \in \{0, 1\}$. It is easy to see that on
the class of all 'double graphs' $2G$, DTC collapses to FO. Take any tuple
$\bar u = (u_1, i_1), \ldots, (u_k, i_k)$ of vertices in a double graph $2G$, and let the closure
of $\bar u$ be the set $\{u_1, \ldots, u_k\} \times \{0, 1\}$. Switching the second component of any
node is an automorphism of $2G$, and hence no definable deterministic path
from $\bar u$ can leave the closure of $\bar u$. That is, if $2G \models [\mathrm{dtc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar u, \bar v)$, then
each node of $\bar v$ belongs to the closure of $\bar u$. Therefore DTC-definable paths
are of bounded length, and can thus be defined by first-order formulae. On
the other hand, the usual argument (based on Ehrenfeucht-Fraïssé games)
showing that transitive closures are not first-order definable applies also to
the class of double graphs. Hence DTC is strictly less powerful than TC on
double graphs. In [51] other graph classes are identified on which TC is more
expressive than DTC. An interesting example is the class of all hypercubes.


Theorem 3.2.30. On finite graphs, DTC $\subsetneq$ TC.


TC is a much richer and more complicated logic than DTC also in
other respects. For instance, DTC has a positive normal form: formulae
$\neg[\mathrm{dtc}_{\bar x, \bar y}\ \varphi(\bar x, \bar y)](\bar u, \bar v)$ can be rewritten using the dtc operator only positively.
On the other hand, the alternation hierarchy in TC is strict [52].


3.3 Fixed-Point Logics 


One of the distinguishing features of finite model theory compared with other 
branches of logic is the eminent role of various kinds of fixed-point logics. 
Fixed-point logics extend a basic logical formalism (such as first-order logic, 
conjunctive queries, or propositional modal logic) by a constructor for forming 
fixed points of relational operators. 

What do we mean by a relational operator? Note that any formula
$\psi(R, \bar x)$ of vocabulary $\tau \cup \{R\}$ can be viewed as defining, for every $\tau$-structure
$\mathfrak{A}$, an update operator $F_\psi : \mathcal{P}(A^k) \to \mathcal{P}(A^k)$ on the class of $k$-ary relations on
$A$, namely

$$F_\psi : R \mapsto \{\bar a : (\mathfrak{A}, R) \models \psi(R, \bar a)\}.$$

A fixed point of $F_\psi$ is a relation $R$ for which $F_\psi(R) = R$. In general, a fixed
point of $F_\psi$ need not exist, or there may exist many of them. However, if $R$
happens to occur only positively in $\psi$, then the operator $F_\psi$ is monotone, and
in that case there exists a least relation $R \subseteq A^k$ such that $F_\psi(R) = R$. The
most influential fixed-point formalisms in logic are concerned with least (and
greatest) fixed points, so we shall discuss these first. In finite model theory,
a number of other fixed-point logics are important as well, and the structure,
expressive power, and algorithmic properties of these logics have been studied
intensively. We shall discuss them later.


3.3.1 Some Fixed-Point Theory 


There is a well-developed mathematical theory of fixed points of monotone
operators on complete lattices. A complete lattice is a partial order $(A, \leq)$
such that each set $X \subseteq A$ has a supremum (a least upper bound) and an
infimum (a greatest lower bound). Here we are interested mainly in power
set lattices $(\mathcal{P}(A^k), \subseteq)$ (where $A$ is the universe of a structure), and later in
product lattices $(\mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m), \subseteq)$. For simplicity, we shall describe the
basic facts of fixed-point theory for lattices $(\mathcal{P}(B), \subseteq)$, where $B$ is an arbitrary
(finite or infinite) set.


Definition 3.3.1. Let $F : \mathcal{P}(B) \to \mathcal{P}(B)$ be a function.


(1) $X \subseteq B$ is a fixed point of $F$ if $F(X) = X$.

(2) A least fixed point or a greatest fixed point of $F$ is a fixed point $X$
    of $F$ such that $X \subseteq Y$ or $Y \subseteq X$, respectively, for each fixed point $Y$ of
    $F$.

(3) $F$ is monotone if $X \subseteq Y \Rightarrow F(X) \subseteq F(Y)$ for all $X, Y \subseteq B$.


Theorem 3.3.2 (Knaster and Tarski). Every monotone operator $F :
\mathcal{P}(B) \to \mathcal{P}(B)$ has a least fixed point $\mathrm{lfp}(F)$ and a greatest fixed point $\mathrm{gfp}(F)$.
Further, these fixed points may be written in the form

$$\mathrm{lfp}(F) = \bigcap\{X : F(X) = X\} = \bigcap\{X : F(X) \subseteq X\},$$
$$\mathrm{gfp}(F) = \bigcup\{X : F(X) = X\} = \bigcup\{X : F(X) \supseteq X\}.$$


Proof. Let $S = \{X \subseteq B : F(X) \subseteq X\}$ and $Y = \bigcap S$. We first show that $Y$ is
a fixed point of $F$.


$F(Y) \subseteq Y$. Clearly, $Y \subseteq X$ for all $X \in S$. As $F$ is monotone, it follows that
$F(Y) \subseteq F(X) \subseteq X$. Hence $F(Y) \subseteq \bigcap S = Y$.

$Y \subseteq F(Y)$. As $F(Y) \subseteq Y$, we have $F(F(Y)) \subseteq F(Y)$, and hence $F(Y) \in S$.
Thus $Y = \bigcap S \subseteq F(Y)$.


By definition, $Y$ is contained in all $X$ such that $F(X) \subseteq X$. In particular,
$Y$ is contained in all fixed points of $F$. Hence $Y$ is the least fixed point of $F$.
The argument for the greatest fixed point is analogous.


Least fixed points can also be constructed inductively. We call an operator
$F : \mathcal{P}(B) \to \mathcal{P}(B)$ inductive if the sequence of its stages $X^\alpha$ (where $\alpha$ is an
ordinal), defined by

$$X^0 := \emptyset,$$
$$X^{\alpha+1} := F(X^\alpha),$$
$$X^\lambda := \bigcup_{\alpha < \lambda} X^\alpha \text{ for limit ordinals } \lambda,$$

is increasing, i.e. if $X^\beta \subseteq X^\alpha$ for all $\beta < \alpha$. Obviously, monotone operators are
inductive. The sequence of stages of an inductive operator eventually reaches
a fixed point, which we denote by $X^\infty$. The least ordinal $\beta$ for which $X^\beta =
X^{\beta+1} = X^\infty$ is called $\mathrm{cl}(F)$, the closure ordinal of $F$.


Lemma 3.3.3. For every inductive operator $F : \mathcal{P}(B) \to \mathcal{P}(B)$, $|\mathrm{cl}(F)| \leq |B|$.


Proof. Let $|B|^+$ denote the smallest cardinal greater than $|B|$. Suppose that
the claim is false for $F$. Then for each $\alpha < |B|^+$ there exists an element
$x_\alpha \in X^{\alpha+1} - X^\alpha$. The set $\{x_\alpha : \alpha < |B|^+\}$ is a subset of $B$ of cardinality
$|B|^+ > |B|$, which is impossible.


Proposition 3.3.4. For monotone operators, the inductively constructed
fixed point coincides with the least fixed point, i.e. $X^\infty = \mathrm{lfp}(F)$.


Proof. As $X^\infty$ is a fixed point, $\mathrm{lfp}(F) \subseteq X^\infty$. For the converse, we show by
induction that $X^\alpha \subseteq \mathrm{lfp}(F)$ for all $\alpha$. As $\mathrm{lfp}(F) = \bigcap\{Z : F(Z) \subseteq Z\}$, it
suffices to show that $X^\alpha$ is contained in all $Z$ for which $F(Z) \subseteq Z$.

For $\alpha = 0$, this is trivial. By monotonicity and the induction hypothesis,
we have $X^{\alpha+1} = F(X^\alpha) \subseteq F(Z) \subseteq Z$. For limit ordinals $\lambda$ with $X^\alpha \subseteq Z$ for
all $\alpha < \lambda$ we also have $X^\lambda = \bigcup_{\alpha < \lambda} X^\alpha \subseteq Z$.


The greatest fixed point can be constructed by a dual induction, starting
with $Y^0 = B$, by setting $Y^{\alpha+1} := F(Y^\alpha)$ and $Y^\lambda = \bigcap_{\alpha < \lambda} Y^\alpha$ for limit
ordinals. The decreasing sequence of these stages then eventually converges
to the greatest fixed point $Y^\infty = \mathrm{gfp}(F)$.

The least and greatest fixed points are dual to each other. For every monotone
operator $F$, the dual operator $F^d : X \mapsto \overline{F(\overline{X})}$ (where $\overline{X}$ denotes the
complement of $X$) is also monotone, and we have that

$$\mathrm{lfp}(F) = \overline{\mathrm{gfp}(F^d)} \quad \text{and} \quad \mathrm{gfp}(F) = \overline{\mathrm{lfp}(F^d)}.$$


Exercise 3.3.5. Prove this.


Everything said so far holds for operators on arbitrary (finite or infinite)
power set lattices. In finite model theory, we consider operators $F : \mathcal{P}(A^k) \to
\mathcal{P}(A^k)$ for finite $A$ only. In this case the inductive constructions will reach the
least or greatest fixed point in a polynomial number of steps. As a consequence,
these fixed points can be constructed efficiently.


Lemma 3.3.6. Let $F : \mathcal{P}(A^k) \to \mathcal{P}(A^k)$ be a monotone operator on a finite
set $A$. If $F$ is computable in polynomial time (with respect to $|A|$), then so are
the fixed points $\mathrm{lfp}(F)$ and $\mathrm{gfp}(F)$.
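
On a finite set the inductive construction is just an iteration that stops when a stage repeats. The following sketch (hypothetical helper names, operators given as Python functions on sets) computes both fixed points by iterating from the empty set and from the full set, and also builds the dual operator so that the duality between least and greatest fixed points can be observed on an example. It assumes that the operator passed in is monotone; for non-monotone operators the loop need not terminate.

```python
def stages(F, start):
    """Iterate F from `start` until two consecutive stages agree; return all stages."""
    seq = [frozenset(start)]
    while True:
        nxt = frozenset(F(seq[-1]))
        if nxt == seq[-1]:
            return seq
        seq.append(nxt)


def lfp(F):
    """Least fixed point of a monotone F: iterate upwards from the empty set."""
    return stages(F, frozenset())[-1]


def gfp(F, B):
    """Greatest fixed point of a monotone F on subsets of B: iterate downwards from B."""
    return stages(F, B)[-1]


def dual(F, B):
    """Dual operator X -> complement of F(complement of X), complements taken in B."""
    B = frozenset(B)
    return lambda X: B - frozenset(F(B - frozenset(X)))


# Example operator on B = {0, 1, 2, 3}: F(X) = nodes all of whose successors lie in X.
B = frozenset(range(4))
succ = {0: {1}, 1: {0}, 2: {3}, 3: set()}


def F(X):
    return {v for v in B if succ[v] <= set(X)}
```

For this example the upward iteration passes through the stages $\emptyset$, $\{3\}$, $\{2, 3\}$, while the downward iteration stops immediately at $B$; each iteration needs at most $|B|$ steps before it stabilises, which is the reason for the polynomial bound in the lemma.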


3.3.2 Least Fixed-Point Logic 


LFP is the logic obtained by adding least and greatest fixed points to first- 
order logic. 


Definition 3.3.7. Least fixed-point logic (LFP) is defined by adding to the
syntax of first-order logic the following least fixed-point formation rule: If
$\psi(R, \bar x)$ is a formula of vocabulary $\tau \cup \{R\}$ with only positive occurrences of
$R$, if $\bar x$ is a tuple of variables, and if $\bar t$ is a tuple of terms (such that the lengths
of $\bar x$ and $\bar t$ match the arity of $R$), then

$$[\mathrm{lfp}\, R\bar x\, .\, \psi](\bar t) \quad \text{and} \quad [\mathrm{gfp}\, R\bar x\, .\, \psi](\bar t)$$

are formulae of vocabulary $\tau$. The free first-order variables of these formulae
are those in $(\mathrm{free}(\psi) - \{x : x \text{ in } \bar x\}) \cup \mathrm{free}(\bar t)$.

Semantics. For any $\tau$-structure $\mathfrak{A}$ providing interpretations for all free variables
in the formula, we have that $\mathfrak{A} \models [\mathrm{lfp}\, R\bar x\, .\, \psi](\bar t)$ if $\bar t^{\mathfrak{A}}$ (the tuple of elements
of $\mathfrak{A}$ interpreting $\bar t$) is contained in $\mathrm{lfp}(F_\psi)$, where $F_\psi$ is the update operator
defined by $\psi$ on $\mathfrak{A}$. Similarly for greatest fixed points.


Example 3.3.8. Here is a fixed-point formula that defines the transitive closure
of the binary predicate $E$:

$$\mathrm{TC}(u, v) := [\mathrm{lfp}\, Txy\, .\, Exy \vee \exists z(Exz \wedge Tzy)](u, v).$$

Note that in a formula $[\mathrm{lfp}\, R\bar x\, .\, \psi](\bar t)$, there may be free variables in $\psi$
additional to those in $\bar x$, and these remain free in the fixed-point formula. They
are often called parameters of the fixed-point formula. For instance, the
transitive closure can also be defined by the formula

$$\varphi(u, v) := [\mathrm{lfp}\, Ty\, .\, Euy \vee \exists x(Tx \wedge Exy)](v),$$

which has $u$ as a parameter.
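
The two definitions of the transitive closure can be compared directly on a finite graph by iterating the corresponding update operators. This is a minimal sketch with hypothetical function names; the first function iterates the binary operator for $T(x, y)$, the second iterates the unary operator for $T(y)$ with the parameter $u$ held fixed.

```python
def lfp(F):
    """Least fixed point of a monotone operator, by iteration from the empty set."""
    X = frozenset()
    while True:
        Y = frozenset(F(X))
        if Y == X:
            return X
        X = Y


def tc_binary(V, E):
    """[lfp Txy . Exy or exists z (Exz and Tzy)]: a binary fixed point."""
    def step(T):
        return {
            (x, y)
            for x in V
            for y in V
            if (x, y) in E or any((x, z) in E and (z, y) in T for z in V)
        }
    return lfp(step)


def reach_from(V, E, u):
    """[lfp Ty . Euy or exists x (Tx and Exy)]: a unary fixed point with parameter u."""
    def step(T):
        return {
            y
            for y in V
            if (u, y) in E or any(x in T and (x, y) in E for x in V)
        }
    return lfp(step)
```

For every node `u`, `reach_from(V, E, u)` agrees with the set of `v` such that `(u, v)` is in `tc_binary(V, E)`. The parameterised version works with a unary relation, which illustrates the trade-off between parameters and arity mentioned in the next exercise.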


Exercise 3.3.9. Show that every LFP-formula is equivalent to one without 
parameters (at the cost of increasing the arity of the fixed-point variables). 


Example 3.3.10. Let $\psi := \forall y(y < x \rightarrow Ry)$ and let $(A, <)$ be a partial order.
The formula $[\mathrm{lfp}\, Rx\, .\, \psi](x)$ then defines the well-founded part of $<$. The
closure ordinal of $F_\psi$ on $(A, <)$ is the length of the longest well-founded initial
segment of $<$, and $(A, <) \models \forall x[\mathrm{lfp}\, Rx\, .\, \psi](x)$ if, and only if, $(A, <)$ is well-founded.


Exercise 3.3.11. Prove that the LFP-sentence

$$\psi := \forall y \exists z\, Fyz \wedge \forall y[\mathrm{lfp}\, Ry\, .\, \forall x(Fxy \rightarrow Rx)](y)$$

is an infinity axiom, i.e. it is satisfiable but does not have a finite model.


Example 3.3.12. The GAME query asks, given a finite game $G = (V, V_0, V_1, E)$,
to compute the set of winning positions for Player 0 (see Section 3.1.3). The
GAME query is LFP-definable, by use of $[\mathrm{lfp}\, Wx\, .\, \varphi](x)$ with

$$\varphi(W, x) := (V_0 x \wedge \exists y(Exy \wedge Wy)) \vee (V_1 x \wedge \forall y(Exy \rightarrow Wy)).$$


The GAME query plays an important role for LFP. It can be shown that 
every LFP-definable property of finite structures can be reduced to GAME by 
a quantifier-free translation [31]. Hence GAME is complete for LFP via this 
notion of reduction, and thus a natural candidate if one is trying to separate 
a weaker logic from LFP. 
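
The fixed-point definition of GAME translates into a simple iteration: start with the empty set and repeatedly add the positions that satisfy $\varphi(W, x)$ with respect to the current stage $W$. The sketch below is only an illustration, with a hypothetical function name and with $V_1$ taken to be the complement of $V_0$; it is a naive iteration, not an optimised game solver.

```python
def winning_region_player0(V, V0, E):
    """Least fixed point of phi(W, x) on the game graph (V, V0, V1, E)."""
    V = set(V)
    V0 = set(V0)
    V1 = V - V0
    succ = {v: {w for (u, w) in E if u == v} for v in V}

    W = set()
    while True:
        new = {
            x
            for x in V
            # Player 0 moves: some successor is already known to be winning.
            if (x in V0 and any(y in W for y in succ[x]))
            # Player 1 moves: all successors are winning (true if Player 1 is stuck).
            or (x in V1 and all(y in W for y in succ[x]))
        }
        if new == W:
            return W
        W = new
```

For example, with positions 0 to 4, $V_0 = \{0, 2\}$ and edges 0→1, 0→3, 1→2, 2→4, 3→3, the stages are $\{4\}$, $\{2, 4\}$, $\{1, 2, 4\}$, $\{0, 1, 2, 4\}$. Position 3 never enters the fixed point, since Player 1 can stay there forever.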


Exercise 3.3.13. Prove that the problem GEN and the circuit value problem 
(see Examples 3.2.13 and 3.2.14) are expressible in LFP. 


The duality between the least and greatest fixed points implies that for
any formula $\psi$,

$$[\mathrm{gfp}\, R\bar x\, .\, \psi](\bar t) \equiv \neg[\mathrm{lfp}\, R\bar x\, .\, \neg\psi[R/\neg R]](\bar t),$$

where $\psi[R/\neg R]$ is the formula obtained from $\psi$ by replacing all occurrences of
$R$-atoms by their negations. (As $R$ occurs only positively in $\psi$, the same is true
for $\neg\psi[R/\neg R]$.) Because of this duality, greatest fixed points are often omitted
in the definition of LFP. On the other hand, it is sometimes convenient to keep
the greatest fixed points, and to use the duality (and de Morgan's laws) to
translate LFP-formulae to negation normal form, i.e. to push negations all
the way to the atoms.


Capturing Polynomial Time 


From the fact that first-order operations are polynomial-time computable and
from Lemma 3.3.6, we can immediately conclude that every LFP-definable
property of finite structures is computable in polynomial time.


Proposition 3.3.14. Let $\psi$ be a sentence in LFP. It is decidable in polynomial
time whether a given finite structure $\mathfrak{A}$ is a model of $\psi$. In short,
LFP $\subseteq$ PTIME.


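Obviously, LFP is a fragment of second-order logic. Indeed, by the
Knaster-Tarski Theorem,

$$[\mathrm{lfp}\, R\bar x\, .\, \psi(R, \bar x)](\bar y) \equiv \forall R((\forall \bar x(\psi(R, \bar x) \rightarrow R\bar x)) \rightarrow R\bar y).$$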
We next relate LFP to SO-HORN.


Theorem 3.3.15. Every formula $\psi \in$ SO-HORN is equivalent to some formula
$\psi^* \in$ LFP.


Proof. By Theorem 3.2.16, we can assume that $\psi = (\exists R_1) \cdots (\exists R_m)\varphi \in
\Sigma^1_1$-HORN. By combining the predicates $R_1, \ldots, R_m$ into a single predicate
$R$ of larger arity and by renaming variables, it is easy to transform $\psi$ into an
equivalent formula

$$\psi' := \exists R\, \forall \bar x\, \forall \bar y \bigwedge_i C_i \wedge \bigwedge_j D_j,$$

where the $C_i$ are clauses of the form $R\bar x \leftarrow \alpha_i(R, \bar x, \bar y)$ (with exactly the same
head $R\bar x$ for every $i$) and the $D_j$ are clauses of the form $0 \leftarrow \beta_j(R, \bar x, \bar y)$. The
clauses $C_i$ define, on every structure $\mathfrak{A}$, a monotone operator $F : R \mapsto \{\bar x :
\bigvee_i \exists \bar y\, \alpha_i(\bar x, \bar y)\}$. Let $R^\omega$ be the least fixed point of this operator. Obviously
$\mathfrak{A} \models \neg\psi$ if and only if $\mathfrak{A} \models \beta_i(R^\omega, \bar a, \bar b)$ for some $i$ and some tuples $\bar a, \bar b$. But
$R^\omega$ is defined by the fixed-point formula

$$\alpha^\omega(\bar x) := [\mathrm{lfp}\, R\bar x\, .\, \bigvee_i \exists \bar y\, \alpha_i(\bar x, \bar y)](\bar x).$$

Hence, for $\beta := \exists \bar x \exists \bar y \bigvee_i \beta_i(\bar x, \bar y)$, $\psi$ is equivalent to the formula $\psi^* :=
\neg\beta[R\bar z/\alpha^\omega(\bar z)]$ obtained from $\neg\beta$ by substituting all occurrences of atoms $R\bar z$
by $\alpha^\omega(\bar z)$. Clearly, this formula is in LFP.


Hence SO-HORN $\leq$ LFP $\leq$ SO. As an immediate consequence of Theorems
3.2.18 and 3.3.15 we obtain the Immerman-Vardi Theorem.


Theorem 3.3.16 (Immerman and Vardi). On ordered structures, least 
fixed-point logic captures polynomial time. 


However, on unordered structures, SO-HORN is strictly weaker than LFP. 


3.3.3 The Modal μ-Calculus


A fragment of LFP that is of fundamental importance in many areas of computer
science (e.g. controller synthesis, hardware verification, and knowledge
representation) is the modal $\mu$-calculus ($L_\mu$). It is obtained by adding least
and greatest fixed points to propositional modal logic (ML). In other words,
$L_\mu$ relates to ML in the same way as LFP relates to FO.

Modal logics such as ML and the $\mu$-calculus are evaluated on transition
systems (alias Kripke structures, alias coloured graphs) at a particular node.
Given a formula $\psi$ and a transition system $G$, we write $G, v \models \psi$ to denote
that $\psi$ holds at node $v$ of $G$. Recall that formulae of ML, for reasoning
about transition systems $G = (V, (E_a)_{a \in A}, (P_b)_{b \in B})$, are built from atomic
propositions $P_b$ by means of the usual propositional connectives and the modal
operators $\langle a \rangle$ and $[a]$. That is, if $\psi$ is a formula and $a \in A$ is an action, then
we can build the formulae $\langle a \rangle \psi$ and $[a]\psi$, with the following semantics:

$$G, v \models \langle a \rangle \psi \text{ iff } G, w \models \psi \text{ for some } w \text{ such that } (v, w) \in E_a,$$
$$G, v \models [a]\psi \text{ iff } G, w \models \psi \text{ for all } w \text{ such that } (v, w) \in E_a.$$

If there is only one transition relation, i.e. $A = \{a\}$, then we simply write
$\Box$ and $\Diamond$ for $[a]$ and $\langle a \rangle$, respectively.

ML can be viewed as an extension of propositional logic. However, in our
context it is more convenient to view it as a simple fragment of first-order logic.
A modal formula $\psi$ defines a query on transition systems, associating with $G$
a set of nodes $\psi^G := \{v : G, v \models \psi\}$, and this set can be defined equivalently
by a first-order formula $\psi^*(x)$. This translation maps atomic propositions $P_b$
to atoms $P_b x$, it commutes with the Boolean connectives, and it translates
the modal operators by use of quantifiers as follows:

$$(\langle a \rangle \psi)^*(x) := \exists y(E_a xy \wedge \psi^*(y)),$$
$$([a]\psi)^*(x) := \forall y(E_a xy \rightarrow \psi^*(y)).$$


Note that the resulting formula has width 2 and can thus be written with 
only two variables. We have proved the following proposition. 


Proposition 3.3.17. For every formula $\psi \in$ ML, there exists a first-order
formula $\psi^*(x)$ of width 2, which is equivalent to $\psi$ in the sense that $G, v \models \psi$
iff $G \models \psi^*(v)$.
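
The translation is easy to carry out mechanically. The sketch below assumes a hypothetical encoding of modal formulae as nested tuples and prints the first-order formula as a string in an ad hoc ASCII syntax (`~`, `&`, `|`, `->`, `exists`, `forall`); none of this notation comes from the text. It reuses the two variables `x` and `y` alternately, which is why two variables suffice.

```python
def standard_translation(formula, var="x"):
    """Translate a modal formula into a first-order formula with free variable `var`.

    Formulae are nested tuples (an assumed encoding):
      ("prop", b), ("not", f), ("and", f, g), ("or", f, g),
      ("dia", a, f) for <a>f, and ("box", a, f) for [a]f.
    """
    other = "y" if var == "x" else "x"
    op = formula[0]
    if op == "prop":
        return f"P_{formula[1]}({var})"
    if op == "not":
        return f"~{standard_translation(formula[1], var)}"
    if op in ("and", "or"):
        sym = "&" if op == "and" else "|"
        left = standard_translation(formula[1], var)
        right = standard_translation(formula[2], var)
        return f"({left} {sym} {right})"
    if op == "dia":
        a, sub = formula[1], formula[2]
        return f"exists {other} (E_{a}({var},{other}) & {standard_translation(sub, other)})"
    if op == "box":
        a, sub = formula[1], formula[2]
        return f"forall {other} (E_{a}({var},{other}) -> {standard_translation(sub, other)})"
    raise ValueError(f"unknown connective: {op!r}")
```

For instance, the formula $\langle a \rangle [a] P_p$ is translated to `exists y (E_a(x,y) & forall x (E_a(y,x) -> P_p(x)))`, where the inner quantifier rebinds `x`.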


The modal fragment of first-order logic is the image of propositional modal 
logic under this translation. It has turned out that the modal fragment has 
interesting algorithmic and model-theoretic properties (see [3] and the refer- 
ences given there). 


Definition 3.3.18. The modal $\mu$-calculus $L_\mu$ extends ML (including
propositional variables $X, Y, \ldots$, which can be viewed as monadic second-order
variables) by the following rule for building fixed-point formulae: If $\psi$ is
a formula in $L_\mu$ and $X$ is a propositional variable that only occurs positively
in $\psi$, then $\mu X.\psi$ and $\nu X.\psi$ are also $L_\mu$-formulae.

The semantics of these fixed-point formulae is completely analogous to
that for LFP. The formula $\psi$ defines on $G$ (with universe $V$, and with
interpretations for other free second-order variables that $\psi$ may have besides $X$)
the monotone operator $F_\psi : \mathcal{P}(V) \to \mathcal{P}(V)$ assigning to every set $X \subseteq V$ the
set $\psi^G(X) := \{v \in V : (G, X), v \models \psi\}$. Now,

$$G, v \models \mu X.\psi \text{ iff } v \in \mathrm{lfp}(F_\psi),$$
$$G, v \models \nu X.\psi \text{ iff } v \in \mathrm{gfp}(F_\psi).$$


Example 3.3.19. The formula $\mu X.\varphi \vee \langle a \rangle X$ asserts that there exists a path
along $a$-transitions to a node where $\varphi$ holds.


The formula $\psi := \nu X.\bigl(\bigvee_{a \in A} \langle a \rangle \mathrm{true} \wedge \bigwedge_{a \in A} [a]X\bigr)$ expresses the assertion
that the given transition system is deadlock-free. In other words, $G, v \models \psi$
if no path from $v$ in $G$ reaches a dead end (i.e. a node without outgoing
transitions).

Finally, the formula $\nu X.\mu Y.\langle a \rangle((\varphi \wedge X) \vee Y)$ says that there exists a path
from the current node on which $\varphi$ holds infinitely often.
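
On a finite transition system these formulae can be evaluated by the naive fixed-point iteration: a least fixed point is approximated from the empty set, a greatest fixed point from the set of all nodes. The following evaluator is a sketch under several assumptions: formulae are encoded as nested tuples, the function name and encoding are hypothetical, and the nested iteration is exponential in the nesting depth of fixed points in the worst case, so it illustrates the semantics rather than an efficient model-checking algorithm.

```python
def evaluate(formula, V, E, props, env=None):
    """Set of nodes of a finite transition system at which `formula` holds.

    V: set of nodes; E: dict mapping an action to a set of pairs (v, w);
    props: dict mapping a proposition name to a set of nodes.
    Formulae (assumed encoding): ("true",), ("prop", p), ("var", X),
    ("and", f, g), ("or", f, g), ("dia", a, f), ("box", a, f),
    ("mu", X, f), ("nu", X, f).
    """
    env = {} if env is None else env
    op = formula[0]
    if op == "true":
        return set(V)
    if op == "prop":
        return set(props[formula[1]])
    if op == "var":
        return set(env[formula[1]])
    if op == "and":
        return evaluate(formula[1], V, E, props, env) & evaluate(formula[2], V, E, props, env)
    if op == "or":
        return evaluate(formula[1], V, E, props, env) | evaluate(formula[2], V, E, props, env)
    if op in ("dia", "box"):
        a = formula[1]
        S = evaluate(formula[2], V, E, props, env)
        if op == "dia":
            return {v for v in V if any((v, w) in E[a] for w in S)}
        return {v for v in V if all(w in S for (u, w) in E[a] if u == v)}
    if op in ("mu", "nu"):
        X, body = formula[1], formula[2]
        current = set() if op == "mu" else set(V)
        while True:
            nxt = evaluate(body, V, E, props, {**env, X: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(f"unknown connective: {op!r}")


# The three formulae of the example, for a single action "a" and phi = p.
reach_p = ("mu", "X", ("or", ("prop", "p"), ("dia", "a", ("var", "X"))))
deadlock_free = ("nu", "X", ("and", ("dia", "a", ("true",)), ("box", "a", ("var", "X"))))
infinitely_often_p = ("nu", "X", ("mu", "Y",
                      ("dia", "a", ("or", ("and", ("prop", "p"), ("var", "X")), ("var", "Y")))))
```

On the system with nodes 0 to 3, transitions 0→1, 0→3, 1→2, 2→1 and $p$ true only at node 2, the first and third formulae hold exactly at nodes 0, 1, 2, while deadlock-freedom holds only at nodes 1 and 2, because node 0 can move to the dead end 3.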


Exercise 3.3.20. Prove that the formulae in Example 3.3.19 do indeed express 
the stated properties. 


The translation from ML into FO is readily extended to a translation from
$L_\mu$ into LFP.


Proposition 3.3.21. Every formula $\psi \in L_\mu$ is equivalent to a formula
$\psi^*(x) \in$ LFP.


Proof. By induction. A formula of the form $\mu X.\varphi$ is translated to $[\mathrm{lfp}\, Xx\, .\, \varphi^*](x)$,
and similarly for greatest fixed points.


Further, the argument proving that LFP can be embedded into SO also
shows that $L_\mu$ is a fragment of MSO.

Let us turn to algorithmic issues. The complexity of the model-checking
problem for $L_\mu$ is a major open problem, as far as combined complexity and
expression complexity are concerned (see Section 3.3.5). However, the data
complexity can be settled easily.


Proposition 3.3.22 (data complexity of $L_\mu$). Fix any formula $\psi \in L_\mu$.
Given a finite transition system $G$ and a node $v$, it can be decided in polynomial
time whether $G, v \models \psi$. Further, there exist $\psi \in L_\mu$ for which the
model-checking problem is PTIME-complete.


Proof. As $L_\mu$ is a fragment of LFP, the first claim is obvious. For the second
claim, recall that the GAME problem for strictly alternating games is
PTIME-complete (see Section 3.1.2). Player 0 has a winning strategy from position
$v \in V_0$ in the game $G = (V, V_0, V_1, E)$ if, and only if, $G, v \models \mu X.\Diamond\Box X$.


Despite this result, it is not difficult to see that the $\mu$-calculus does not
suffice to capture PTIME, even in very restricted scenarios such as word
structures. Indeed, as $L_\mu$ is a fragment of MSO, it can only define regular
languages, and of course, not all PTIME-languages are regular. However, we
shall see in Section 3.5.3 that there is a multidimensional variant of $L_\mu$ that
captures the bisimulation-invariant fragment of PTIME.

For more information on the $\mu$-calculus, we refer to [5, 21, 56] and the
references therein.


3.3.4 Parity Games 


For least fixed-point logics, the appropriate evaluation games are parity games.
These are games of possibly infinite duration where each position is assigned
a natural number, called its priority, and the winner of an infinite play is
determined according to whether the least priority seen infinitely often during
the play is even or odd. It is open whether winning sets and winning strategies
for parity games can be computed in polynomial time. The best algorithms
known today are polynomial in the size of the game, but exponential with
respect to the number of priorities. Practically competitive model-checking
algorithms for the modal $\mu$-calculus work by solving the strategy problem for
the associated parity game (see e.g. [73]).


Definition 3.3.23. We describe a parity game by a labelled graph $G =
(V, V_0, V_1, E, \Omega)$, where $(V, V_0, V_1, E)$ is a game graph as in Section 3.1.2, and
$\Omega : V \to \mathbb{N}$ assigns to each position a priority. The set $V$ of positions may
be finite or infinite, but the number of different priorities must be finite; it is
called the index of $G$. Recall that a finite play of a game is lost by the player
who gets stuck, i.e. cannot move. The difference to the games of Section 3.1.2
is that we have different winning conditions for infinite plays $v_0 v_1 v_2 \ldots$. If the
smallest number appearing infinitely often in the sequence $\Omega(v_0)\Omega(v_1)\ldots$ of
priorities is even, then Player 0 wins the play; otherwise, Player 1 wins.
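
For an ultimately periodic play, i.e. a finite prefix followed by a cycle that is repeated forever, the winning condition can be read off directly, because exactly the positions on the cycle are visited infinitely often. The following small sketch (hypothetical function name, priorities given as a dictionary) makes this explicit.

```python
def lasso_winner(priority, prefix, cycle):
    """Winner (0 or 1) of the infinite play prefix, cycle, cycle, cycle, ..."""
    if not cycle:
        raise ValueError("the repeated part of the play must be non-empty")
    # The prefix is irrelevant: only positions on the cycle occur infinitely often.
    least = min(priority[v] for v in cycle)
    return 0 if least % 2 == 0 else 1
```

With priorities $\Omega(a) = 1$, $\Omega(b) = 2$, $\Omega(c) = 3$, the play $a(bc)^\omega$ is won by Player 0, whereas $b(ac)^\omega$ is won by Player 1.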


Recall that a positional strategy of Player $\sigma$ is a partial function $f :
V_\sigma \to V$ with $(v, f(v)) \in E$. A strategy $f$ is said to be winning on a set of
positions $W \subseteq V$ if any play that starts at a position in $W$ and is consistent
with $f$ is winning for Player $\sigma$. Further, $W_\sigma$, the winning region of Player $\sigma$,
is the set of positions from which Player $\sigma$ has a winning strategy (which, a
priori, need not be positional).


Exercise 3.3.24. (Combination of positional strategies). Let $f$ and $f'$
be positional strategies for Player $\sigma$ that are winning on the sets $W$ and $W'$,
respectively. Let $f < f'$ be the positional strategy defined by

$$(f < f')(x) := \begin{cases} f(x) & \text{if } x \in W \\ f'(x) & \text{otherwise.} \end{cases}$$

Prove that $f < f'$ is winning on $W \cup W'$.


The Positional Determinacy Theorem for parity games states that parity 
games are always determined (i.e., from each position, one of the players has 
a winning strategy) and in fact, positional strategies always suffice. This was 
proved independently by Emerson and Jutla [40] and by Mostowski [86]. Ear- 
lier, Gurevich and Harrington [62] had proved that Muller games (which are 
more general than parity games) are determined via finite-memory strategies. 


Theorem 3.3.25 (Positional Determinacy). In any parity game, the set
of positions can be partitioned into two sets $W_0$ and $W_1$ such that Player 0
has a positional strategy that is winning on $W_0$ and Player 1 has a positional
strategy that is winning on $W_1$.


Here, we only prove this theorem for the case of finite game graphs. The
presentation is inspired by a similar proof due to Ehrenfeucht and Mycielski
[39] for mean payoff games; see also [12]. For the general case, we refer the
reader to [102] or [97].


Proof. Let $G = (V, V_0, V_1, E, \Omega)$ be a parity game with a finite set $V$ of
positions. We call a position $v \in V$ live if it is non-terminal (i.e. if there is at
least one possible move from $v$). The theorem trivially holds for games with
at most one live position. We now proceed by induction over the number of
live positions.

For every live position $v$ in $G$ and for $\sigma = 0, 1$, we define the game $G[v, \sigma]$,
which is the same as $G$ except that we change $v$ to a terminal position where
Player $\sigma$ wins. (Technically this means that we put $v$ into $V_{1-\sigma}$ and delete
all outgoing edges from $v$.) By the induction hypothesis, the Positional
Determinacy Theorem holds for $G[v, \sigma]$, and we write $W_0[v, \sigma]$ and $W_1[v, \sigma]$ for the
winning regions of $G[v, \sigma]$.

It suffices to show that for every live position u in G, one of the players has 
a positional strategy to win G from u. By Exercise 3.3.24, these strategies can 
then be combined into positional strategies that win on the entire winning 
regions. 

Clearly,

$$W_0[v, 1] \subseteq W_0 \quad \text{and} \quad W_1[v, 0] \subseteq W_1.$$

Moreover, any positional strategy $f$ for Player $\sigma$ that is winning from position
$u$ in the game $G[v, 1 - \sigma]$ is also winning from $u$ in the game $G$ and avoids $v$
(i.e. no play that starts at $u$ and is consistent with $f$ ever hits position $v$).
Now let

$$A_\sigma := \bigcup_{v \text{ live}} W_\sigma[v, 1 - \sigma].$$

We call positions $u \in A_\sigma$ strong winning positions for Player $\sigma$ because,
informally speaking, Player $\sigma$ can win $G$ from $u$ even if she gives away some
live positions to her opponent. Similarly, positions outside $A_0 \cup A_1$ are called
weak positions. It remains to show that from weak positions also, one of the
players has a positional winning strategy. In fact, one of the players wins, with
a positional winning strategy, from all weak positions.

By the induction hypothesis, if $u$ is not in $A_{1-\sigma}$, then, for all live positions
$v$ of $G$, we have that $u \in W_\sigma[v, \sigma]$ and, moreover, Player $\sigma$ has a positional
strategy $f_v$ by which, starting at any position $u \notin A_{1-\sigma}$, she either wins or
eventually reaches $v$.

We distinguish two cases, depending on whether or not there exist strong 
winning positions that are live (terminal positions are, of course, always 
strong). 


Case (i). Suppose that there exists a live position $v \in A_\sigma$. In this case,
Player $\sigma$ also wins from every weak position $u$.

We already know that Player $\sigma$ has a positional strategy $f$ to win $G$ from
$v$, and a positional strategy $f_v$ by which she either wins $G$ or reaches $v$ from
$u$. We can easily combine $f$ and $f_v$ into a positional winning strategy $f^*$ to
win $G$ from $u$: we set $f^*(x) := f(x)$ if $f$ is winning from $x$, and $f^*(x) := f_v(x)$
otherwise.


Case (ii). Suppose now that all live positions are weak. We claim that in
this case, Player 0 wins from all live (i.e. all weak) positions if the minimal
priority on $G$ is even, and Player 1 wins from all live positions if the minimal
priority is odd.

Since all live positions are weak, we already know that Player $\sigma$ has, for
every live position $y$, a positional strategy $f_y$ by which she either wins or
reaches $y$ from any live position in $G$.

Take now a live position $v$ of minimal priority, and put $\sigma = 0$ if $\Omega(v)$ is
even, and $\sigma = 1$ if $\Omega(v)$ is odd. In addition, pick any live position $w \neq v$.
We combine the positional winning strategies $f_v$ and $f_w$ into a new positional
strategy $f$ with

$$f(x) := \begin{cases} f_w(x) & \text{if } x = v \\ f_v(x) & \text{otherwise.} \end{cases}$$

We claim that $f$ is a winning strategy for Player $\sigma$ from all live positions
of $G$. If a play in $G$ in which Player $\sigma$ moves according to $f$ hits $v$ only finitely
often, then this play eventually coincides with a play consistent with $f_v$, and is
therefore won by Player $\sigma$. But if the play hits $v$ infinitely often, the minimal
priority seen infinitely often is $\Omega(v)$, and hence Player $\sigma$ wins also in this case.


Exercise 3.3.26. Let $G$ be a parity game with winning sets $W_0$ and $W_1$.
Obviously every positional winning strategy for Player 0 has to remain inside
$W_0$, i.e. $f(V_0 \cap W_0) \subseteq W_0$. However, remaining inside the winning region
does not suffice for winning a game! Construct a parity game and a positional
strategy $f$ for Player 0 such that all plays consistent with $f$ remain inside
$W_0$, yet are won by Player 1. Hint: a trivial game with two positions suffices.


Exercise 3.3.27. A future game is any game on a game graph $G =
(V, V_0, V_1, E)$ where the winning condition does not depend on finite prefixes
of plays. This means that whenever $\pi = v_0 v_1 \cdots$ and $\pi' = v'_0 v'_1 \cdots$ are two
infinite plays of $G$ such that, for some $n$ and $m$, $v_n v_{n+1} \cdots = v'_m v'_{m+1} \cdots$, then
$\pi$ and $\pi'$ are won by the same player. Obviously parity games are a special
case of future games.

Prove that for every future game $G$, the winning region of Player 0 is a
fixed point (not necessarily the least one) of the operator $F_\psi$ defined by the
formula $\psi(X) := (V_0 \wedge \Diamond X) \vee (V_1 \wedge \Box X)$. Since $F_\psi$ is monotone, the least and
greatest fixed points exist, and $\mathrm{lfp}(F_\psi) \subseteq W_0 \subseteq \mathrm{gfp}(F_\psi)$. Find conditions (on
parity games) implying that $W_0 = \mathrm{lfp}(F_\psi)$ or that $W_0 = \mathrm{gfp}(F_\psi)$.


Theorem 3.3.28. It can be decided in NP $\cap$ Co-NP whether a given position
in a parity game is a winning position for Player 0.


Proof. A node $v$ in a parity game $G = (V, V_0, V_1, E, \Omega)$ is a winning position
for Player $\sigma$ if there exists a positional strategy $f : V_\sigma \to V$ which is winning
from position $v$. It therefore suffices to show that the question of whether a
given $f : V_\sigma \to V$ is a winning strategy for Player $\sigma$ from position $v$ can
be decided in polynomial time. We prove this for Player 0; the argument for
Player 1 is analogous.

Given $G$ and $f : V_0 \to V$ we obtain a reduced game graph $G_f = (V, E_f)$
by keeping only the moves that are consistent with $f$, i.e.

$$E_f = \{(v, w) : (v \in V_0 \wedge w = f(v)) \vee (v \in V_1 \wedge (v, w) \in E)\}.$$

In this reduced game, only the opponent, Player 1, makes non-trivial moves.
We call a cycle in $(V, E_f)$ odd if the smallest priority of its nodes is odd.
Clearly, Player 0 wins $G$ from position $v$ via strategy $f$ if, and only if, in $G_f$,
no odd cycle and no terminal position $w \in V_0$ are reachable from $v$. Since the
reachability problem is solvable in polynomial time, the claim follows.
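
The verification step of this proof can be sketched as follows. The code assumes a finite game given by explicit sets and a priority dictionary, takes $V_1$ to be the complement of $V_0$, and uses hypothetical function names. An odd cycle is detected by checking, for every reachable node $u$ of odd priority $p$, whether $u$ lies on a cycle that uses only nodes of priority at least $p$; this is one of several possible polynomial-time tests and is chosen here for brevity.

```python
def reachable(start, succ):
    """Nodes reachable from `start` (including `start`) in the graph `succ`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in succ.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def wins_player0(V, V0, E, priority, f, v):
    """Does the positional strategy f (a dict on V0) win for Player 0 from v?"""
    # Reduced graph G_f: Player 0 follows f, Player 1 keeps all of her moves.
    succ = {u: set() for u in V}
    for (u, w) in E:
        if u in V0:
            if f.get(u) == w:
                succ[u].add(w)
        else:
            succ[u].add(w)

    region = reachable(v, succ)

    # Player 0 must never get stuck at a reachable position of her own.
    if any(u in V0 and not succ[u] for u in region):
        return False

    # No reachable cycle may have an odd least priority.
    for u in region:
        p = priority[u]
        if p % 2 == 1:
            restricted = {
                x: {y for y in succ[x] if priority[y] >= p}
                for x in region
                if priority[x] >= p
            }
            if any(u in reachable(w, restricted) for w in restricted[u]):
                return False
    return True
```

For the game with $V_0 = \{0\}$, edges 0→1, 0→2, 1→0, 2→2 and priorities $\Omega(0) = 0$, $\Omega(1) = \Omega(2) = 1$, the strategy $f(0) = 1$ is accepted (the only cycle has least priority 0), while $f(0) = 2$ is rejected because of the odd self-loop at position 2.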


In fact, Jurdziński [72] proved that the problem is in UP $\cap$ Co-UP, where
UP denotes the class of NP-problems with unique witnesses. The best known 
deterministic algorithms to compute winning partitions of parity games have 
running times that are polynomial with respect to the size of the game graph, 
but exponential with respect to the index of the game [73]. 


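Theorem 3.3.29. The winning partition of a parity game $G =
(V, V_0, V_1, E, \Omega)$ of index $d$ can be computed in space $O(d \cdot |E|)$ and
time

$$O\left(d \cdot |E| \cdot \left(\frac{|V|}{\lfloor d/2 \rfloor}\right)^{\lfloor d/2 \rfloor}\right).$$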


The Unfolding of a Parity Game 


Let $G = (V, V_0, V_1, E, \Omega)$ be a parity game. We assume that the minimal
priority in the range of $\Omega$ is even, and that every node $v$ with minimal priority
has a unique successor $s(v)$ (i.e. $vE = \{s(v)\}$). This is no loss of generality.
We can always transform a parity game in such a way that all nodes with
non-maximal priority have unique successors (i.e. choices are made only at the
least relevant nodes). If the smallest priority in the game is odd, we consider
instead the dual game (with the roles of the players switched and priorities
decreased by one).

Let $T$ be the set of nodes with minimal priority and let $G^-$ be the game
obtained by deleting from $G$ all edges $(v, s(v)) \in T \times V$ so that the
nodes in $T$ become terminal positions. We define the unfolding of $G$ as a
sequence of games $G^\alpha$ (where $\alpha$ ranges over the ordinals) which
all coincide with $G^-$ up to the winning conditions for the terminal positions
$v \in T$. For every $\alpha$, we define a decomposition
$T = T_0^\alpha \cup T_1^\alpha$, where $T_\sigma^\alpha$ is the set of
$v \in T$ in which we declare, for the game $G^\alpha$, Player $\sigma$ to be
the winner. Further, for every $\alpha$, we write $W_\sigma^\alpha$ for the
winning set of Player $\sigma$ in the game $G^\alpha$. Note that
$W_\sigma^\alpha$ depends of course on the decomposition
$T = T_0^\alpha \cup T_1^\alpha$ (this also applies concerning positions
outside $T$). In turn, the decomposition of $T$ for $\alpha + 1$ depends on the
winning sets $W_\sigma^\alpha$ in $G^\alpha$. We set


$$\begin{aligned}
T_0^0 &:= T \\
T_0^{\alpha+1} &:= \{v \in T : s(v) \in W_0^\alpha\} \\
T_0^\lambda &:= \bigcap_{\alpha < \lambda} T_0^\alpha \quad \text{for limit ordinals } \lambda.
\end{aligned}$$


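By determinacy, $V = W_0^\alpha \cup W_1^\alpha$ for all $\alpha$, and with
increasing $\alpha$, the winning sets of Player 0 are decreasing and the
winning sets of Player 1 are increasing:


$$\begin{aligned}
W_0^0 &\supseteq W_0^1 \supseteq \cdots \supseteq W_0^\alpha \supseteq W_0^{\alpha+1} \supseteq \cdots \\
W_1^0 &\subseteq W_1^1 \subseteq \cdots \subseteq W_1^\alpha \subseteq W_1^{\alpha+1} \subseteq \cdots
\end{aligned}$$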


Hence there exists an ordinal $\alpha$ (whose cardinality is bounded by the
cardinality of $V$) for which $W_0^\alpha = W_0^{\alpha+1} =: W_0^\infty$ and
$W_1^\alpha = W_1^{\alpha+1} =: W_1^\infty$. We claim that these fixed points
coincide with the winning sets $W_0$ and $W_1$ for the original game $G$.


Lemma 3.3.30 (Unfolding Lemma). $W_0 = W_0^\infty$ and $W_1 = W_1^\infty$.


Proof. It suffices to define a strategy $f$ for Player 0 and a strategy $g$ for
Player 1 for the game $G$, by means of which Player $\sigma$ wins from all
positions $v \in W_\sigma^\infty$.

First, we fix a winning strategy $f^\alpha$ for Player 0 in $G^\alpha$, with
winning set $W_0^\alpha = W_0^\infty$. Note that $f^\alpha$ can be trivially
extended to a strategy $f$ for the game $G$, since the nodes in $T$ have unique
successors in $G$. We claim that $f$ is in fact a winning strategy in $G$ from
all positions $v \in W_0^\infty$.

To see this, consider any play $v_0 v_1 v_2 \ldots$ in $G$ from position
$v_0 \in W_0^\infty$ against $f$. Such a play can never leave $W_0^\infty$. If
$v_i \in W_0^\infty \setminus T$, then $v_{i+1} \in W_0^\infty$ because $f$ is
a winning strategy for $G^\alpha$; and if
$v_i \in W_0^\infty \cap T = W_0^{\alpha+1} \cap T$, then
$v_i \in T_0^{\alpha+1}$, which implies, by the definition of
$T_0^{\alpha+1}$, that $v_{i+1} = s(v_i) \in W_0^\alpha = W_0^\infty$. But a
play that never leaves $W_0^\infty$ is necessarily won by Player 0: either it
goes only finitely often through positions in $T$, and then coincides from a
certain point onwards with a winning play in $G^\alpha$, or it goes infinitely
often through positions in $T$, in which case Player 0 wins because the minimal
priority that is hit infinitely often is even.

To construct a winning strategy for Player 1 in the game $G$, we define, for
every node $v \in W_1^\infty$, the ordinal


$$\rho(v) := \min\{\beta : v \in W_1^\beta\}.$$


We fix, for every ordinal $\alpha$, a winning strategy $g^\alpha$ for Player 1
with winning set $W_1^\alpha$ in the game $G^\alpha$, and set


$$g(v) := g^{\rho(v)}(v) \quad \text{for all } v \in V_1 \setminus T$$


and $g(v) := s(v)$ for $v \in V_1 \cap T$.
Consider any play $v_0 v_1 v_2 \ldots$ in $G$ from position
$v_0 \in W_1^\infty$ against $g$. We claim that whenever
$v_i \in W_1^\infty$, then


(1) $v_{i+1} \in W_1^\infty$,
(2) $\rho(v_{i+1}) \le \rho(v_i)$, and
(3) if $v_i \in T$, then $\rho(v_{i+1}) < \rho(v_i)$.


If $v_i \in W_1^\infty \setminus T$ and $\rho(v_i) = \alpha$, then
$v_i \in W_1^\alpha$, and therefore (since Player 1 moves locally according to
his winning strategy $g^\alpha$ and Player 0 cannot leave winning sets of her
opponent) $v_{i+1} \in W_1^\alpha$. But if $v_i \in W_1^\infty \cap T$ and
$\rho(v_i) = \alpha$, then $v_i \in T_1^\alpha$, $\alpha = \beta + 1$ is a
successor ordinal, and $v_{i+1} = s(v_i) \in W_1^\beta$ (by the definition of
$T_1^\alpha$). Hence $\rho(v_{i+1}) \le \beta < \rho(v_i)$.

Properties (1), (2), and (3) imply that the play stays inside $W_1^\infty$ and
that the values $\rho(v_i)$ are decreasing. Since there are no infinite
strictly descending chains of ordinals, the play eventually remains inside
$W_1^\alpha$, for a fixed $\alpha$, and outside $T$ (since moves from $T$ would
reduce the value of $\rho(v)$). Hence the play eventually coincides with a play
in $G^\alpha$ in which Player 1 plays according to his winning strategy
$g^\alpha$. Thus, Player 1 wins.


3.3.5 Model-Checking Games for Least Fixed-Point Logic 


For the purpose of defining evaluation games for LFP-formulae and analysing
the complexity of model checking, it is convenient to make the following
assumptions. First, the fixed-point formulae should not contain parameters (the
reason for this will be discussed below). Second, the formula should be in
negation normal form, i.e. negations apply to atoms only, and third, it should
be well-named, i.e. every fixed-point variable is bound only once and the free
second-order variables are distinct from the fixed-point variables. We write
$D_\psi(T)$ for the unique subformula in $\psi$ of the form
$[\mathrm{fp}\, T\bar{x} \,.\, \varphi(T, \bar{x})]$ (where fp means either lfp
or gfp). For technical reasons, we assume, finally, that each fixed-point
variable $T$ occurs in $D_\psi(T)$ only inside the scope of a quantifier. This
is a common assumption that does not affect the expressive power. We say that
$T'$ depends on $T$ if $T$ occurs free in $D_\psi(T')$. The transitive closure
of this dependency relation is called the dependency order, denoted by
$\sqsubset_\psi$. The alternation level $\mathrm{al}_\psi(T)$ of $T$ in $\psi$
is the maximal number of alternations between least and greatest fixed-point
variables on the $\sqsubset_\psi$-paths from $T$. The alternation depth
$\mathrm{ad}(\psi)$ of a fixed-point formula $\psi$ is the maximal alternation
level of its fixed point variables.

Consider now a finite structure $\mathfrak{A}$ and an LFP-formula
$\psi(\bar{x})$, which we assume to be well-named, in negation normal form, and
without parameters. The model-checking game $G(\mathfrak{A}, \psi(\bar{a}))$ is
a parity game. As in the case of first-order logic, the positions of the game
are expressions $\varphi(\bar{b})$, i.e. subformulae of $\psi$ that are
instantiated by elements of $\mathfrak{A}$. The initial position is
$\psi(\bar{a})$. The moves are as in the first-order game, except for the
positions associated with fixed-point formulae and with fixed-point atoms. At
such positions there is a unique move (by Falsifier, say) to the formula
defining the fixed point. For a more formal definition, recall that as $\psi$
is well-named, there is, for any fixed-point variable $T$ in $\psi$, a unique
subformula $[\mathrm{fp}\, T\bar{x} \,.\, \varphi(T, \bar{x})](\bar{y})$. From
position $[\mathrm{fp}\, T\bar{x} \,.\, \varphi(T, \bar{x})](\bar{b})$,
Falsifier moves to $\varphi(T, \bar{b})$, and from any fixed point atom
$T\bar{c}$, she moves to the position $\varphi(T, \bar{c})$.

Hence, in the case where we do not have fixed points, the game is the usual
model-checking game for first-order logic. Next, we consider the case of a
formula with only one fixed-point operator, which is an lfp. The intuition
is that from position $[\mathrm{lfp}\, T\bar{x} \,.\, \varphi(T, \bar{x})](\bar{b})$,
Verifier tries to establish that $\bar{b}$ enters $T$ at some stage $\alpha$ of
the fixed-point induction that is defined by $\varphi$ on $\mathfrak{A}$. The
game goes to $\varphi(T, \bar{b})$ and from there, as $\varphi$ is a
first-order formula, Verifier can either win the $\varphi$-game in a finite
number of steps, or force it to a position $T\bar{c}$, where $\bar{c}$ enters
the fixed point at some stage $\beta < \alpha$. The game then resumes at
position $\varphi(\bar{c})$, associated again with $\varphi$. As any descending
sequence of ordinals is finite, Verifier will win the game in a finite number
of steps. If the formula is not true, then Falsifier can either win in a finite
number of steps or force the play to go through infinitely many positions of
the form $T\bar{c}$. Hence, these positions should be assigned priority 1 (and
all other positions higher priorities) so that such a play will be won by
Falsifier. For gfp-formulae, the situation is reversed. Verifier wants to force
an infinite play, going infinitely often through positions $T\bar{c}$, so
gfp-atoms are assigned priority 0.

In the general case, we have a formula $\psi$ with nested least and greatest
fixed points, and in an infinite play of $G(\mathfrak{A}, \psi(\bar{a}))$ one
may see different fixed point variables infinitely often. But one of these
variables is then the smallest with respect to the dependency order
$\sqsubset_\psi$. It can be shown that $\mathfrak{A} \models \psi$ iff this
smallest variable is a gfp-variable (provided the players play optimally).

Hence, the priority labelling should assign even priorities to gfp-atoms
and odd priorities to lfp-atoms. Further, if $T \sqsubset_\psi T'$ and $T, T'$
are fixed-point variables of different kinds, then $T$-atoms should get a lower
priority than $T'$-atoms.

As the index of a parity game is the main source of difficulty in computing
winning sets, the number of different priorities should be kept as small as
possible. We can avoid the factor of 2 appearing in common constructions of
this kind by adjusting the definitions of the alternation level and alternation
depth, setting $\mathrm{al}^*_\psi(T) := \mathrm{al}_\psi(T) + 1$ if either
$\mathrm{al}_\psi(T)$ is even and $T$ is an lfp-variable, or
$\mathrm{al}_\psi(T)$ is odd and $T$ is a gfp-variable. In all other cases,
$\mathrm{al}^*_\psi(T) = \mathrm{al}_\psi(T)$. Finally, let
$\mathrm{ad}^*(\psi)$ be the maximal value of $\mathrm{al}^*_\psi(T)$ for the
fixed-point variables in $\psi$. The priority labelling $\Omega$ on positions
of $G(\mathfrak{A}, \psi)$ is then defined by
$\Omega(T\bar{b}) = \mathrm{al}^*_\psi(T)$ for fixed-point atoms, and
$\Omega(\varphi(\bar{b})) = \mathrm{ad}^*(\psi)$ for all other formulae.

This completes the definition of the game $G(\mathfrak{A}, \psi(\bar{a}))$.
Note that the priority labelling has the properties described above, and that
the index of $G(\mathfrak{A}, \psi(\bar{a}))$ is at most $\mathrm{ad}(\psi) + 1$.
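
The adjustment of alternation levels is a parity correction, which the
following Python sketch makes explicit. The function name `adjusted_level` and
the string tags "lfp" and "gfp" are assumptions made for illustration only; the
alternation level itself is taken as given.

```python
def adjusted_level(level, kind):
    """Adjusted alternation level: odd for lfp-variables, even for gfp-variables."""
    if kind not in ("lfp", "gfp"):
        raise ValueError("kind must be 'lfp' or 'gfp'")
    # Raise the level by one exactly when its parity does not match the kind.
    if (level % 2 == 0 and kind == "lfp") or (level % 2 == 1 and kind == "gfp"):
        return level + 1
    return level
```

With this labelling every lfp-atom receives an odd priority and every gfp-atom
an even one, and the adjusted level exceeds the original level by at most one.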


Theorem 3.3.31. Let $\psi(\bar{x})$ be a well-named and parameter-free
LFP-formula in negation normal form, and let $\mathfrak{A}$ be a relational
structure. Then $\mathfrak{A} \models \psi(\bar{a})$ if and only if Player 0
has a winning strategy for the parity game $G(\mathfrak{A}, \psi(\bar{a}))$.


Proof. This is proved by induction on $\psi$. The interesting case concerns
fixed-point formulae $\psi(\bar{a}) := [\mathrm{gfp}\, T\bar{x} \,.\, \varphi(\bar{x})](\bar{a})$.

In the game $G(\mathfrak{A}, \psi(\bar{a}))$, the positions of minimal priority
are the fixed-point atoms $T\bar{b}$, which have unique successors
$\varphi(\bar{b})$. By the induction hypothesis we know that, for every
interpretation $T_0$ of $T$, $(\mathfrak{A}, T_0) \models \varphi(\bar{a})$ iff
Player 0 has a winning strategy for $G((\mathfrak{A}, T_0), \varphi(\bar{a}))$.
By the unfolding of greatest fixed points, we also know that
$\mathfrak{A} \models [\mathrm{gfp}\, T\bar{x} \,.\, \varphi(\bar{x})](\bar{a})$
iff $(\mathfrak{A}, T^\alpha) \models \varphi(\bar{a})$ for all approximations
$T^\alpha$.

By ordinal induction, one can immediately see that the games
$G((\mathfrak{A}, T^\alpha), \varphi(\bar{a}))$ coincide with the unfolding of
the game $G = G(\mathfrak{A}, \psi(\bar{a}))$ to the games $G^\alpha$. By the
Unfolding Lemma, we conclude that Player 0 wins the game
$G(\mathfrak{A}, \psi(\bar{a}))$ if, and only if, she wins all games
$G^\alpha$, which is the case if, and only if,
$(\mathfrak{A}, T^\alpha) \models \varphi(\bar{a})$ for all $\alpha$, which is
equivalent to $\mathfrak{A} \models \psi(\bar{a})$.

For least fixed-point formulae we proceed by dualization. 


Clearly, the size of the game $G(\mathfrak{A}, \psi(\bar{a}))$ (and the time
complexity of its construction) is bounded by
$|\mathrm{cl}(\psi)| \cdot |A|^{\mathrm{width}(\psi)}$. Hence, for LFP-formulae
of bounded width, the size of the game is polynomially bounded.


Corollary 3.3.32. The model-checking problem for LFP-formulae of bounded
width (and without parameters) is in NP ∩ Co-NP, in fact in UP ∩ Co-UP.


As formulae of the μ-calculus can be viewed as LFP-formulae of width 2,
the same bound applies to $L_\mu$. (For a different approach to this problem,
which does not mention games explicitly, see [100].) It is a well-known open
problem whether the model-checking problem for $L_\mu$ can be solved in
polynomial time.


Exercise 3.3.33. Prove that if the model-checking problem for $L_\mu$ can be
solved in polynomial time, then the same is true for (parameter-free)
LFP-formulae of width $k$, for any fixed $k \in \mathbb{N}$. Hint: given a
finite structure $\mathfrak{A} = (A, R_1, \ldots, R_m)$, with relations $R_i$
of arities $r_i \le k$, let $G^k(\mathfrak{A})$ be the transition system with
universe $A^k$, unary relations
$R_i^* = \{(a_1, \ldots, a_k) : (a_1, \ldots, a_{r_i}) \in R_i\}$ and
$I_{ij} = \{(a_1, \ldots, a_k) : a_i = a_j\}$, and binary relations
$E_j = \{(\bar{a}, \bar{b}) : a_i = b_i \text{ for } i \neq j\}$ (for
$j = 1, \ldots, k$) and
$E_\sigma = \{(\bar{a}, \bar{b}) : b_i = a_{\sigma(i)} \text{ for } i = 1, \ldots, k\}$
for each substitution $\sigma : \{1, \ldots, k\} \to \{1, \ldots, k\}$.
Translate formulae $\psi \in$ LFP of width $k$ into formulae
$\psi^* \in L_\mu$ such that $\mathfrak{A} \models \psi(\bar{a})$ iff
$G^k(\mathfrak{A}), \bar{a} \models \psi^*$. (See [55, pp. 110-111] for
details.)


By Theorem 3.3.29, we obtain the following deterministic complexity 
bounds for LFP model checking. 


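Theorem 3.3.34. Given a finite structure $\mathfrak{A}$ and a formula
$\psi(\bar{a})$ of width $k$ and alternation depth $d$, it can be decided
whether $\mathfrak{A} \models \psi(\bar{a})$ in space
$O(d \cdot |\mathrm{cl}(\psi)| \cdot |A|^k)$ and time

$$O\left(d^2 \cdot \left(\frac{|\mathrm{cl}(\psi)| \cdot |A|^k}{\lfloor (d+1)/2 \rfloor}\right)^{\lfloor (d+3)/2 \rfloor}\right).$$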


Corollary 3.3.35. The model-checking problem for LFP-formulae of bounded 
width and bounded alternation depth is solvable in polynomial time. 


Fixed-Point Formulae with Parameters 


We have imposed the condition that the fixed-point formulae do not contain 
parameters. If parameters are allowed, then, at least with a naive definition of 
width, Corollary 3.3.32 is no longer true (unless UP = PSPACE). The intuitive 
reason is that parameters allow us to ‘hide’ first-order variables in fixed-point 
variables. Indeed, Dziembowski [37] proved that QBF, the evaluation problem 
for quantified Boolean formulae, can be reduced to evaluating LFP-formulae 
with two first-order variables (but an unbounded number of monadic fixed- 
point variables) on a fixed structure with three elements. Hence the expression 
complexity of evaluating such formulae is PSPACE-complete. A similar argu- 
ment works for the case where also the number of fixed-point variables is 
bounded, but the structure is not fixed (combined complexity rather than 
expression complexity). We remark that the collection of all unwindings in 
infinitary logic of LFP-formulae with k variables, including parameters, is not 
contained in any bounded width fragment of infinitary logic. 


LFP-Formulae of Unbounded Width 


For LFP-formulae of unbounded width, Theorem 3.3.34 gives only an expo- 
nential time bound. In fact, this cannot be improved, even for very simple 
LFP-formulae [99]. 


Theorem 3.3.36 (Vardi). The model-checking problem for LFP-formulae 
(of unbounded width) is EXPTIME-complete, even for formulae with only one 
fixed-point operator, and on a fixed structure with only two elements. 


We defer the hardness proof to Section 3.3.10, where we shall show that 
the expression complexity is EXPTIME-hard even for Datalog, which is a 
more restricted formalism than LFP. 
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3.3.6 Definability of Winning Regions in Parity Games 


We have seen that the model-checking problem for the μ-calculus or LFP can
be reduced to the problem of computing winning regions in parity games.
In fact, there is also a reduction in the reverse direction. We can represent
any parity game $G = (V, V_0, V_1, E, \Omega)$ with a priority function
$\Omega : V \to \{0, \ldots, d-1\}$ by a transition system
$(V, E, V_0, V_1, P_0, \ldots, P_{d-1})$, where $P_i = \{v : \Omega(v) = i\}$.
We can then construct, for every fixed $d \in \mathbb{N}$, a formula
$\mathrm{Win}_d$ of the μ-calculus that defines the winning region of Player 0
in any parity game with priorities $0, \ldots, d-1$. We set


$$\mathrm{Win}_d = \nu X_0\, \mu X_1\, \nu X_2 \cdots \lambda X_{d-1} \bigvee_{j=0}^{d-1} \big((V_0 \wedge P_j \wedge \Diamond X_j) \vee (V_1 \wedge P_j \wedge \Box X_j)\big).$$


In this formula, the fixed-point operators alternate between $\nu$ and $\mu$,
and hence $\lambda = \nu$ if $d$ is odd, and $\lambda = \mu$ if $d$ is even.
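
On a finite game, the alternating fixed-point formula can be evaluated by naive
nested iteration: greatest fixed points (even index) start from the full node
set, least fixed points (odd index) from the empty set. The Python sketch below
is only meant to illustrate the semantics of the formula; its running time is
exponential in the number of priorities, and the name `win_region` and the
dictionary encoding of the game are assumptions, not part of the text.

```python
def win_region(nodes, v0_nodes, edges, priority, d):
    """Evaluate the alternating fixed-point formula on a finite parity game."""
    succ = {u: set(edges.get(u, ())) for u in nodes}

    def body(xs):
        # The inner disjunction: a node u of priority j satisfies it if its
        # owner can (Player 0) or must (Player 1) move into the set X_j.
        out = set()
        for u in nodes:
            target = xs[priority[u]]
            if u in v0_nodes:
                ok = any(w in target for w in succ[u])  # diamond
            else:
                ok = all(w in target for w in succ[u])  # box
            if ok:
                out.add(u)
        return out

    def solve(i, xs):
        if i == d:
            return body(xs)
        # nu (even i) iterates downwards from all nodes, mu (odd i) upwards.
        current = set(nodes) if i % 2 == 0 else set()
        while True:
            nxt = solve(i + 1, xs + [current])
            if nxt == current:
                return current
            current = nxt

    return solve(0, [])
```

For instance, in a two-node game where Player 0 owns a node of priority 0 with
a self-loop and Player 1 owns a node of priority 1 with only a self-loop, the
function returns exactly the node of Player 0.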


Theorem 3.3.37. For every $d \in \mathbb{N}$, the formula $\mathrm{Win}_d$
defines the winning region of Player 0 in parity games with priorities
$0, \ldots, d-1$.


Proof. We have to show that, for any parity game
$G = (V, E, V_0, V_1, P_0, \ldots, P_{d-1})$ and every position $v \in V$,


$$G, v \models \mathrm{Win}_d \iff \text{Player 0 has a winning strategy for } G \text{ from } v.$$


To see this, let $G^*$ be the model-checking game for the formula
$\mathrm{Win}_d$ on $G, v$ and identify Verifier with Player 0 and Falsifier
with Player 1. Hence, Player 0 has a winning strategy for $G^*$ if, and only
if, $G, v \models \mathrm{Win}_d$.

By the construction of model-checking games, $G^*$ has positions of the form
$(\varphi, u)$, where $u \in V$ and $\varphi$ is a subformula of
$\mathrm{Win}_d$. The priority of a position $(X_i, u)$ is $i$, and when
$\varphi$ is not a fixed point variable, the priority of $(\varphi, u)$ is $d$.


We claim that the game $G^*$ is essentially, i.e. up to elimination of stupid
moves and contraction of several moves into one, the same as the original
game $G$. To see this, we compare playing $G$ from a current position
$u \in V_0 \cap P_i$ with playing $G^*$ from any position $(\varphi_k, u)$,
where $\varphi_k$ is the subformula of $\mathrm{Win}_d$ that starts with
$\nu X_k$ or $\mu X_k$.

In $G$, Player 0 selects at position $u$ a successor $w \in uE$, and the
play proceeds from $w$. In $G^*$, the play goes from $(\varphi_k, u)$ through
positions $(\varphi_{k+1}, u), \ldots, (\varphi_{d-1}, u)$ to
$(\vartheta, u)$, where


$$\vartheta = \bigvee_{j=0}^{d-1} \big((V_0 \wedge P_j \wedge \Diamond X_j) \vee (V_1 \wedge P_j \wedge \Box X_j)\big).$$


The only reasonable choice for Verifier (Player 0) at this point is to move to
the position $(V_0 \wedge P_i \wedge \Diamond X_i, u)$, since with any other
move she would lose immediately. But from there, the only reasonable move of
Falsifier (Player 1) is to go to position $(\Diamond X_i, u)$, and it is now
the turn of Player 0 to select a successor $w \in uE$ and move to position
$(X_i, w)$ from which the play proceeds to $(\varphi_i, w)$.

Thus one move from $u$ to $w$ in $G$ corresponds to a sequence of moves in
$G^*$ from $(\varphi_k, u)$ to $(\varphi_i, w)$, but the only genuine choice is
the move from $(\Diamond X_i, u)$ to $(X_i, w)$, i.e. the choice of a successor
$w \in uE$. In $G$, the position $u$ has priority $i$, and in $G^*$ the
minimal, and hence relevant, priority that is seen in the sequence of moves
from $(\varphi_k, u)$ to $(\varphi_i, w)$ is that of $(X_i, w)$, which is also
$i$. The situation for positions $u \in V_1 \cap P_i$ is the same, except that
the play in $G^*$ now goes through $(\Box X_i, u)$ and it is Player 1 who
selects a successor $w \in uE$ and moves to $(X_i, w)$.

Hence the (reasonable) choices that have to be made by the players in $G^*$
and the relevant priorities that are seen are the same as in a corresponding
play of $G$. Thus, Player 0 has a winning strategy for $G$ from $v$ if, and
only if, Player 0 has a winning strategy for $G^*$ from position
$(\varphi_0, v)$. But since $G^*$ is the model-checking game for
$\mathrm{Win}_d$ on $G$, with initial position $(\varphi_0, v)$, this is the
case if, and only if, $G, v \models \mathrm{Win}_d$.


Corollary 3.3.38. The following three problems are algorithmically equiva- 
lent, in the sense that if one of them admits a polynomial-time algorithm, 
then all of them do. 


(1) Computing winning regions in parity games. 

(2) The model-checking problem for LFP-formulae of width at most $k$, for
any $k \ge 2$.

(3) The model-checking problem for the modal μ-calculus.


The formulae $\mathrm{Win}_d$ also play an important role in the study of the
alternation hierarchy of the modal μ-calculus. Clearly, $\mathrm{Win}_d$ has
alternation depth $d$, and it has been shown that no formula of the μ-calculus
with alternation depth less than $d$ can be equivalent to $\mathrm{Win}_d$.
Hence the alternation hierarchy of the μ-calculus is strict [4, 20].


3.3.7 Simultaneous Fixed-Point Inductions


A more general variant of LFP permits simultaneous inductions over several
formulae. A simultaneous induction is based on a system of operators of
the form


$$\begin{aligned}
F_1 &: \mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m) \to \mathcal{P}(B_1) \\
&\;\;\vdots \\
F_m &: \mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m) \to \mathcal{P}(B_m),
\end{aligned}$$


forming together an operator

$$F = (F_1, \ldots, F_m) : \mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m) \to \mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m).$$


Inclusion on the product lattice
$\mathcal{P}(B_1) \times \cdots \times \mathcal{P}(B_m)$ is componentwise.
Accordingly, $F$ is monotone if, whenever $X_i \subseteq Y_i$ for all $i$, then
also $F_i(\bar{X}) \subseteq F_i(\bar{Y})$ for all $i$.

Everything said above about least and greatest fixed points carries over
to simultaneous induction. In particular, a monotone operator $F$ has a least
fixed point $\mathrm{lfp}(F)$ which can be constructed inductively, starting
with $\bar{X}^0 = (\emptyset, \ldots, \emptyset)$ and iterating $F$ until a
fixed point $\bar{X}^\infty$ is reached.

One can extend the logic LFP by a simultaneous fixed point formation 
rule. 


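Definition 3.3.39. Simultaneous least fixed-point logic, denoted by S-LFP,
is the extension of first-order logic by the following rule.


Syntax. Let $\psi_1(\bar{R}, \bar{x}_1), \ldots, \psi_m(\bar{R}, \bar{x}_m)$ be
formulae of vocabulary $\tau \cup \{R_1, \ldots, R_m\}$, with only positive
occurrences of $R_1, \ldots, R_m$, and, for each $i \le m$, let $\bar{x}_i$ be
a sequence of variables matching the arity of $R_i$. Then


$$S := \begin{cases} R_1\bar{x}_1 := \psi_1 \\ \quad\vdots \\ R_m\bar{x}_m := \psi_m \end{cases}$$


is a system of update rules, which is used to build formulae
$[\mathrm{lfp}\, R_i : S](\bar{t})$ and $[\mathrm{gfp}\, R_i : S](\bar{t})$
(for any tuple $\bar{t}$ of terms whose length matches the arity of $R_i$).


Semantics. On each structure $\mathfrak{A}$, $S$ defines a monotone operator
$S^{\mathfrak{A}} = (S_1, \ldots, S_m)$ mapping tuples
$\bar{R} = (R_1, \ldots, R_m)$ of relations on $A$ to
$S^{\mathfrak{A}}(\bar{R}) = (S_1(\bar{R}), \ldots, S_m(\bar{R}))$ where
$S_i(\bar{R}) := \{\bar{a} : (\mathfrak{A}, \bar{R}) \models \psi_i(\bar{R}, \bar{a})\}$.
As the operator is monotone, it has a least fixed point
$\mathrm{lfp}(S^{\mathfrak{A}}) = (R_1^\infty, \ldots, R_m^\infty)$. Now
$\mathfrak{A} \models [\mathrm{lfp}\, R_i : S](\bar{a})$ if
$\bar{a} \in R_i^\infty$. Similarly for greatest fixed points.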


Example 3.3.40. We return to the circuit value problem for circuits with
fan-in 2 and NAND gates (see Example 3.2.14). Simultaneous LFP-definitions of
the nodes evaluating to true and false in the given circuit
$(V, E, I^+, I^-)$ are given by the formulae $[\mathrm{lfp}\, T : S](z)$ and
$[\mathrm{lfp}\, F : S](z)$, respectively, where $S$ is the system


$$\begin{aligned}
Tz &:= I^+z \vee \exists x(Exz \wedge Fx) \\
Fz &:= I^-z \vee \exists x \exists y(Exz \wedge Eyz \wedge x \neq y \wedge Tx \wedge Ty).
\end{aligned}$$
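
The simultaneous induction of this example can be run directly on a small
circuit. The following Python sketch iterates both update rules in parallel
from empty relations until nothing changes. The function name
`evaluate_nand_circuit` and the encoding of the circuit as a set of edge pairs
(an edge (x, z) meaning that x is an input of gate z) are assumptions made for
illustration.

```python
def evaluate_nand_circuit(nodes, edges, true_inputs, false_inputs):
    """Simultaneous least fixed point of the update rules for T and F."""
    # preds[z] is the set of nodes feeding into z.
    preds = {z: {x for (x, y) in edges if y == z} for z in nodes}
    true_nodes, false_nodes = set(), set()
    while True:
        # T z := I+ z  or  some input of z is false.
        new_true = set(true_inputs) | {z for z in nodes if preds[z] & false_nodes}
        # F z := I- z  or  two distinct inputs of z are true.
        new_false = set(false_inputs) | {
            z for z in nodes if len(preds[z] & true_nodes) >= 2
        }
        if (new_true, new_false) == (true_nodes, false_nodes):
            return true_nodes, false_nodes
        true_nodes, false_nodes = new_true, new_false
```

Both relations are updated from the previous stage of the pair, which is
exactly what distinguishes a simultaneous induction from two separate ones.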


Elimination of Simultaneous Fixed-Points 


The question arises of whether simultaneous fixed points provide more
expressive power than simple ones. We shall prove that this is not the case.
Simultaneous least fixed points can be simulated by nested simple ones, via a
technique that is sometimes called the Bekić principle [5]. We shall consider
only the case of two monotone operators


$$\begin{aligned}
F &: \mathcal{P}(A) \times \mathcal{P}(B) \to \mathcal{P}(A) \\
G &: \mathcal{P}(A) \times \mathcal{P}(B) \to \mathcal{P}(B).
\end{aligned}$$


We write $(F^\infty, G^\infty)$ for the least fixed point of the combined
operator $(F, G)$. For any fixed $X \subseteq A$, the operator
$G_X : \mathcal{P}(B) \to \mathcal{P}(B)$ with $G_X(Y) := G(X, Y)$ is also
monotone, and therefore has a least fixed point
$\mathrm{lfp}(G_X) \subseteq B$.


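Lemma 3.3.41. The operator $E$ on $\mathcal{P}(A)$, defined by
$E(X) := F(X, \mathrm{lfp}(G_X))$, is monotone and has the least fixed point
$\mathrm{lfp}(E) = F^\infty$.


Proof. If $X \subseteq X'$, then a trivial induction shows that
$G_X^\alpha \subseteq G_{X'}^\alpha$ for all stages $G_X^\alpha$ and
$G_{X'}^\alpha$ of the induced operators $G_X$ and $G_{X'}$. As a consequence,
$\mathrm{lfp}(G_X) \subseteq \mathrm{lfp}(G_{X'})$ and
$E(X) = F(X, \mathrm{lfp}(G_X)) \subseteq F(X', \mathrm{lfp}(G_{X'})) = E(X')$.
This shows that $E$ is monotone.

Note that $\mathrm{lfp}(G_{F^\infty}) \subseteq G^\infty$, because
$G_{F^\infty}(G^\infty) = G(F^\infty, G^\infty) = G^\infty$. Hence $G^\infty$
is a fixed point of $G_{F^\infty}$ and therefore contains the least fixed point
$\mathrm{lfp}(G_{F^\infty})$. Further,


$$E(F^\infty) = F(F^\infty, \mathrm{lfp}(G_{F^\infty})) \subseteq F(F^\infty, G^\infty) = F^\infty.$$

As $\mathrm{lfp}(E) = \bigcap\{X : E(X) \subseteq X\}$ it follows that
$\mathrm{lfp}(E) \subseteq F^\infty$.

It remains to show that $F^\infty \subseteq \mathrm{lfp}(E)$. We proceed by
induction, showing that the stages $(F^\alpha, G^\alpha)$ of the operator
$(F, G)$ satisfy

$$(F^\alpha, G^\alpha) \subseteq (\mathrm{lfp}(E), \mathrm{lfp}(G_{\mathrm{lfp}(E)})).$$


For $\alpha = 0$, this is clear. Further,


$$\begin{aligned}
F^{\alpha+1} &= F(F^\alpha, G^\alpha) \subseteq F(\mathrm{lfp}(E), \mathrm{lfp}(G_{\mathrm{lfp}(E)})) = E(\mathrm{lfp}(E)) = \mathrm{lfp}(E) \\
G^{\alpha+1} &= G(F^\alpha, G^\alpha) \subseteq G(\mathrm{lfp}(E), \mathrm{lfp}(G_{\mathrm{lfp}(E)})) = G_{\mathrm{lfp}(E)}(\mathrm{lfp}(G_{\mathrm{lfp}(E)})) = \mathrm{lfp}(G_{\mathrm{lfp}(E)}).
\end{aligned}$$


Finally, for limit ordinals the induction argument is trivial.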
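
On finite sets the statement of the lemma can be checked mechanically. The
Python sketch below computes the least fixed point of a pair of monotone
operators once by simultaneous iteration and once by the nested construction
$E(X) = F(X, \mathrm{lfp}(G_X))$. The helper names `lfp`, `simultaneous_lfp`
and `nested_lfp` are hypothetical, and the sketch assumes that the operators
passed in are monotone.

```python
def lfp(op):
    """Least fixed point of a monotone operator on finite sets, by iteration."""
    current = frozenset()
    while True:
        nxt = frozenset(op(current))
        if nxt == current:
            return current
        current = nxt


def simultaneous_lfp(F, G):
    """Least fixed point (F_inf, G_inf) of the combined operator (F, G)."""
    x, y = frozenset(), frozenset()
    while True:
        nx, ny = frozenset(F(x, y)), frozenset(G(x, y))
        if (nx, ny) == (x, y):
            return x, y
        x, y = nx, ny


def nested_lfp(F, G):
    """Least fixed point of E(X) = F(X, lfp(G_X)) with G_X(Y) = G(X, Y)."""
    return lfp(lambda x: F(x, lfp(lambda y: G(x, y))))
```

As a usage example, take F(X, Y) = {0} ∪ {n + 1 : n ∈ Y, n < 6} and
G(X, Y) = {n + 1 : n ∈ X, n < 6}. Both constructions then yield the even
numbers up to 6 as the first component.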

We are now ready to show that for any system


$$S := \begin{cases} R_1\bar{x}_1 := \psi_1 \\ \quad\vdots \\ R_m\bar{x}_m := \psi_m \end{cases}$$


the formulae $[\mathrm{lfp}\, R_i : S](\bar{x})$ are equivalent to simple LFP
formulae. Further, the translation does not increase the number and arity of
the fixed-point variables $R_1, \ldots, R_m$, nor the alternation depth (i.e.
the changes between least and greatest fixed points). It therefore remains
valid for interesting fragments of LFP, such as monadic LFP and
alternation-free LFP, and also for the modal μ-calculus (see [5]). It does,
however, increase the nesting depth of fixed-point operators. (We remark that
there are alternative elimination techniques that do not increase the nesting
depth, but instead augment the arity of the fixed-point operators.)


Theorem 3.3.42. S-LFP = LFP. 


Proof. Obviously LFP is contained in S-LFP. For the converse, we restrict 
our attention to simultaneous inductions over two formulae. The general case 
is treated by analogous arguments. 


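Given a system

$$S := \begin{cases} R\bar{x} := \psi(R, T) \\ T\bar{y} := \varphi(R, T) \end{cases}$$


we claim that


$$\begin{aligned}
[\mathrm{lfp}\, R : S](\bar{u}) &\equiv [\mathrm{lfp}\, R\bar{x} \,.\, \psi(R, [\mathrm{lfp}\, T\bar{y} \,.\, \varphi])](\bar{u}) \\
[\mathrm{lfp}\, T : S](\bar{v}) &\equiv [\mathrm{lfp}\, T\bar{y} \,.\, \varphi([\mathrm{lfp}\, R\bar{x} \,.\, \psi], T)](\bar{v}).
\end{aligned}$$


We shall prove the first equivalence. We fix a structure $\mathfrak{A}$ and
consider the operator $S^{\mathfrak{A}} = (F, G)$ with
$F : (R, T) \mapsto \{\bar{a} : \mathfrak{A} \models \psi(R, T, \bar{a})\}$ and
$G : (R, T) \mapsto \{\bar{a} : \mathfrak{A} \models \varphi(R, T, \bar{a})\}$.
Writing $(F^\infty, G^\infty)$ for the least fixed point of $(F, G)$ we have
that $\mathfrak{A} \models [\mathrm{lfp}\, R : S](\bar{a})$ iff
$\bar{a} \in F^\infty$.

The formula $\psi(R, [\mathrm{lfp}\, T\bar{y} \,.\, \varphi])$ defines on
$\mathfrak{A}$ the operator $E : R \mapsto F(R, \mathrm{lfp}(G_R))$ with
$G_R : T \mapsto G(R, T)$, and we have that
$\mathfrak{A} \models [\mathrm{lfp}\, R\bar{x} \,.\, \psi(R, [\mathrm{lfp}\, T\bar{y} \,.\, \varphi])](\bar{a})$
iff $\bar{a} \in \mathrm{lfp}(E)$. But, by the previous lemma,
$F^\infty = \mathrm{lfp}(E)$.

While we have shown that simultaneous fixed points do not provide more
expressive power, they permit us to write formulae in a more modular and
more readable form.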


Positive LFP 


While LFP and the modal μ-calculus allow arbitrary nesting of least and
greatest fixed points, and arbitrary interleaving of fixed points with Boolean
operations and quantifiers, classical studies of inductive definability over
first-order logic (such as [85]) focus on a more restricted logic. Let
$\mathrm{LFP}_1$ (sometimes also called positive LFP) be the extension of
first-order logic that is obtained by taking least fixed points of positive
first-order formulae (without parameters) and closing them under disjunction,
conjunction, and existential and universal quantification, but not under
negation (for a more formal definition, see Chap. 2). $\mathrm{LFP}_1$ can be
conveniently characterized in terms of simultaneous least fixed points. We just
state the result; for a proof see Chap. 2 again.

Theorem 3.3.43. A query is definable in $\mathrm{LFP}_1$ if and only if it is
definable by a formula of the form $[\mathrm{lfp}\, R : S](\bar{x})$, where $S$
is a system of update rules $R_i\bar{x} := \varphi_i(\bar{R}, \bar{x})$ with
first-order formulae $\varphi_i$. Moreover, we can require, without diminishing
the expressive power, that each of the formulae $\varphi_i$ in the system is
either a purely existential formula or a purely universal formula.


3.3.8 Inflationary Fixed-Point Logic 


LFP is only one instance of a logic with an explicit operator for forming 
fixed points. A number of other fixed-point extensions of first-order logic (or 
fragments of it) have been extensively studied in finite model theory. These 
include inflationary, partial, non-deterministic, and alternating fixed point 
logics. All of these have in common that they allow the construction of fixed 
points of operators that are not necessarily monotone. 

An operator $G : \mathcal{P}(B) \to \mathcal{P}(B)$ is called inflationary if
$G(X) \supseteq X$ for all $X \subseteq B$. With any operator $F$ one can
associate an inflationary operator $G$, defined by $G(X) := X \cup F(X)$. In
particular, inflationary operators are inductive, so iterating $G$ yields a
fixed point, called the inflationary fixed point of $F$.
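
On a finite set the inflationary fixed point is obtained by a simple loop. The
Python sketch below is an illustration with hypothetical function names; it
also includes the plain iteration for monotone operators so that the two
notions can be compared on examples.

```python
def inflationary_fixed_point(F):
    """Iterate X -> X | F(X) from the empty set until nothing new is added."""
    current = frozenset()
    while True:
        nxt = current | frozenset(F(current))
        if nxt == current:
            return current
        current = nxt


def least_fixed_point(F):
    """Iterate X -> F(X) from the empty set; assumes that F is monotone."""
    current = frozenset()
    while True:
        nxt = frozenset(F(current))
        if nxt == current:
            return current
        current = nxt
```

For example, on the universe U = {0, 1, 2, 3} the complement operator
F(X) = U \ X is not monotone and has no fixed point at all, yet its
inflationary fixed point exists and equals U. Since every non-final step adds
at least one element, the loop stops after at most |U| + 1 iterations.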


Exercise 3.3.44. Prove the following facts. (1) Monotone operators need not 
be inflationary, and inflationary operators need not be monotone. (2) An 
inflationary operator need not have a least fixed point. (3) The least fixed point 
of an inflationary operator (if it exists) may be different from the inductive 
fixed point. (4) However, if F is a monotone operator, then its inflationary 
fixed point and its least fixed point coincide. 


The logic IFP is defined with a syntax similar to that of LFP, but without 
the requirement that the fixed-point variable occurs only positively in the 
formula, and with a semantics given by the associated inflationary operator. 


Definition 3.3.45. IFP is the extension of first-order logic by the following
fixed-point formation rule. For every formula $\psi(R, \bar{x})$, every tuple
$\bar{x}$ of variables, and every tuple $\bar{t}$ of terms (such that the
lengths of $\bar{x}$ and $\bar{t}$ match the arity of $R$), we can build a
formula $[\mathrm{ifp}\, R\bar{x} \,.\, \psi](\bar{t})$.


Semantics. On a given structure $\mathfrak{A}$, we have that
$\mathfrak{A} \models [\mathrm{ifp}\, R\bar{x} \,.\, \psi](\bar{t})$ if
$\bar{t}^{\mathfrak{A}}$ is contained in the union of the stages $R^\alpha$ of
the inflationary operator $G_\psi$ defined by
$G_\psi(R) := R \cup F_\psi(R)$.


By the last item of Exercise 3.3.44, least and inflationary inductions are 
equivalent for positive formulae, and hence IFP is at least as expressive as 
LFP. On finite structures, inflationary inductions reach the fixed point after 
a polynomial number of iterations, hence every IFP-definable class of finite 
structures is decidable in polynomial time. 


Proposition 3.3.46. IFP captures PTIME on ordered finite structures. 
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Least Versus Inflationary Fixed-Points 


As both logics capture PTIME, IFP and LFP are equivalent on ordered finite 
structures. What about unordered structures? It was shown by Gurevich and 
Shelah [63] that the equivalence of IFP and LFP holds on all finite structures. 
Their proof does not work on infinite structures, and indeed there are some 
important aspects in which least and inflationary inductions behave differ- 
ently. For instance, there are first-order operators (on arithmetic, say) whose 
inflationary fixed point is not definable as the least fixed point of a first-order 
operator. Further, the alternation hierarchy in LFP is strict, whereas IFP has 
a positive normal form (see Exercise 3.3.52 below). Hence it was conjectured 
by many that IFP might be more powerful than LFP. However, Kreutzer [80] 
showed recently that IFP is equivalent to LFP on arbitrary structures. Both 
proofs, by Gurevich and Shelah and by Kreutzer, rely on constructions show- 
ing that the stage comparison relations of inflationary inductions are definable 
by lfp inductions. 


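Definition 3.3.47. For every inductive operator
$F : \mathcal{P}(B) \to \mathcal{P}(B)$, with stages $X^\alpha$ and an
inductive fixed point $X^\infty$, the $F$-rank of an element $b \in B$ is
$|b|_F := \min\{\alpha : b \in X^\alpha\}$ if $b \in X^\infty$, and
$|b|_F = \infty$ otherwise. The stage comparison relations of $F$ are defined
by

$$\begin{aligned}
a \le_F b \quad &\text{iff} \quad |a|_F \le |b|_F < \infty \\
a \prec_F b \quad &\text{iff} \quad |a|_F < |b|_F.
\end{aligned}$$

Given a formula $\varphi(R, \bar{x})$, we write $\le_\varphi$ and
$\prec_\varphi$ for the stage comparison relations defined by the operator
$F_\varphi$ (assuming that it is indeed inductive), and
$\le_\varphi^{\mathrm{inf}}$ and $\prec_\varphi^{\mathrm{inf}}$ for the stage
comparison relations of the associated inflationary operator
$G_\varphi : R \mapsto R \cup \{\bar{a} : \mathfrak{A} \models \varphi(R, \bar{a})\}$.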


Example 3.3.48. For the formula
$\varphi(T, x, y) := Exy \vee \exists z(Exz \wedge Tzy)$ the relation
$\prec_\varphi$ on a graph $(V, E)$ is distance comparison:


$$(a,b) \prec_\varphi (c,d) \quad \text{iff} \quad \mathrm{dist}(a,b) < \mathrm{dist}(c,d).$$
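
The ranks in this example can be computed by running the induction stage by
stage and recording when each pair first appears. In the Python sketch below,
the function name `stage_ranks` and the encoding of the graph as a set of edge
pairs are assumptions; pairs that never enter the fixed point are simply absent
from the result, which corresponds to rank infinity.

```python
def stage_ranks(nodes, edges):
    """Rank of each pair (x, y) in the induction T(x,y) := E(x,y) or E(x,z), T(z,y)."""
    rank = {}
    stage = set()
    alpha = 0
    while True:
        alpha += 1
        # One application of the operator to the current stage.
        new = {
            (x, y)
            for x in nodes
            for y in nodes
            if (x, y) in edges
            or any((x, z) in edges and (z, y) in stage for z in nodes)
        }
        if new == stage:
            return rank
        for pair in new - stage:
            rank[pair] = alpha  # first stage containing the pair
        stage = new
```

On the path 0 → 1 → 2 → 3, the pair (1, 3) receives rank 2 and the pair (0, 3)
rank 3, in agreement with the lengths of the shortest paths.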


Stage comparison theorems are results about the definability of stage
comparison relations. For instance, Moschovakis [85] proved that the stage
comparison relations $\le_\varphi$ and $\prec_\varphi$ of any positive
first-order formula $\varphi$ are definable by a simultaneous induction over
positive first-order formulae. For results on the equivalence of IFP and LFP
one needs a stage comparison theorem for IFP inductions.

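We first observe that the stage comparison relations for IFP inductions
are easily definable in IFP. For any formula $\varphi(T, \bar{x})$, the stage
comparison relation $\prec_\varphi^{\mathrm{inf}}$ is defined by the formula


$$[\mathrm{ifp}\, \bar{x} \prec \bar{y} \,.\, \varphi[T\bar{u} / \bar{u} \prec \bar{x}](\bar{x}) \wedge \neg\varphi[T\bar{u} / \bar{u} \prec \bar{y}](\bar{y})](\bar{x}, \bar{y}).$$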


However, what we need to show is that the stage comparison relation for IFP
inductions is in fact LFP-definable.

Theorem 3.3.49 (Inflationary Stage Comparison). For any formula
$\varphi(R, \bar{x})$ in FO or LFP, the stage comparison relation
$\prec_\varphi^{\mathrm{inf}}$ is definable in LFP. On finite structures, it is
even definable in positive LFP.


See [38, 63] for proofs in the case of finite structures and [80] for the more
difficult construction in the general case. From this result, the equivalence
of LFP and IFP follows easily.


Theorem 3.3.50 (Kreutzer). For every IFP-formula, there is an equivalent 
LFP-formula. 


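Proof. For any formula $\varphi(R, \bar{x})$,
$[\mathrm{ifp}\, R\bar{x} \,.\, \varphi](\bar{x}) \equiv \varphi(\{\bar{y} : \bar{y} \prec_\varphi^{\mathrm{inf}} \bar{x}\}, \bar{x})$.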


Stage comparison theorems also have other interesting consequences. For 
instance, Moschovakis’s Theorem implies that on finite structures, greatest 
fixed points (i.e. negations of least fixed points) can be expressed in positive 
LFP. This gives a normal form for LFP and IFP (see [67]). 


Theorem 3.3.51 (Immerman). On finite structures, every LFP-formula
(and hence also every IFP-formula) is equivalent to a formula in
$\mathrm{LFP}_1$.


This result fails on infinite structures. On infinite structures, there exist 
LFP formulae that are not equivalent to positive formulae, and in fact the 
alternation hierarchy of least and greatest fixed points is strict (see [20, 85]). 


Exercise 3.3.52. Prove that every IFP-formula is equivalent to one that uses 
ifp-operators only positively. Hint: assuming that structures contain at least 
two elements and that a constant 0 is available, a formula $\neg[\mathrm{ifp}\, R\bar{x}\,.\,\psi(R,\bar{x})](\bar{x})$
is equivalent to an inflationary induction on a predicate $T\bar{x}y$ which, for $y \neq 0$,
simulates the induction defined by $\psi$, checks whether the fixed point has been
reached, and then makes atoms $T\bar{x}0$ true if $\bar{x}$ is not contained in the fixed
point.


In finite model theory, owing to the Gurevich-Shelah Theorem, the two 
logics LFP and IFP have often been used interchangeably. However, there are 
significant differences that are sometimes overlooked. Despite the equivalence 
of IFP and LFP, inflationary inductions are a more powerful concept than 
monotone inductions. The translation from IFP-formulae to equivalent LFP- 
formulae can make the formulae much more complicated, requires an increase 
in the arity of fixed-point variables and, in the case of infinite structures, in- 
troduces alternations between least and greatest fixed points. Therefore it is 
often more convenient to use inflationary inductions in explicit constructions, 
the advantage being that one is not restricted to inductions over positive for- 
mulae. For an example, see the proof of Theorem 3.5.26 below. Furthermore, 
IFP is more robust, in the sense that inflationary fixed points remain well de- 
fined even when other non-monotone operators (e.g. generalized quantifiers) 
are added to the language (see, for instance, [35]). 

The differences between least and inflationary fixed points are particularly
significant in the context of modal logic, i.e. when we compare the
modal $\mu$-calculus $L_\mu$ with its inflationary counterpart. For instance, $L_\mu$ has
the finite-model property, the satisfiability problem is decidable (complete
for EXPTIME), the model-checking problem is in NP $\cap$ Co-NP (and conjectured
by many to be solvable in polynomial time), and there are practical,
automata-based techniques for solving the algorithmic problems associated
with $L_\mu$. Finally, in terms of expressive power, $L_\mu$ can be characterized as the
bisimulation-invariant fragment of monadic second-order logic (MSO) [69].
On the other hand, the inflationary counterpart of $L_\mu$, the modal iteration
calculus (MIC) [33], behaves very differently. The finite-model property fails,
the satisfiability problem is undecidable (and not even in the arithmetic hier- 
archy), the model-checking problem is PSPACE-complete, and the expressive 
power goes beyond monadic second-order logic even on words. The appropri- 
ate model-checking games for inflationary fixed-point logics such as IFP and 
MIC are backtracking games [34]. These games are a generalization of par- 
ity games with an additional rule allowing players, under certain conditions, to 
return to an earlier position in the play and revise a choice or to force a count- 
back on the number of moves. This new feature makes backtracking games 
more powerful so that they can capture inflationary inductions. Accordingly, 
winning strategies become more complex objects and computationally harder 
than for parity games. 


3.3.9 Partial Fixed-Point Logic 


Another fixed-point logic that is relevant to finite structures is the partial 
fixed-point logic (PFP). Let $\psi(R,\bar{x})$ be an arbitrary formula defining on a
finite structure $\mathfrak{A}$ a (not necessarily monotone) operator $F_\psi : R \mapsto \{\bar{a} : \mathfrak{A} \models \psi(R,\bar{a})\}$,
and consider the sequence of its finite stages $R^0 := \emptyset$, $R^{m+1} := F_\psi(R^m)$.

This sequence is not necessarily increasing. Nevertheless, as $\mathfrak{A}$ is finite, the
sequence either converges to a fixed point, or reaches a cycle with a period
greater than one. We define the partial fixed point of $F_\psi$ as the fixed point
that is reached in the former case, and as the empty relation otherwise. The
logic PFP is obtained by adding to first-order logic the partial-fixed-point
formation rule, which allows us to build from any formula $\psi(R,\bar{x})$ a formula
$[\mathrm{pfp}\, R\bar{x}\,.\,\psi(R,\bar{x})](\bar{t})$, saying that $\bar{t}$ is contained in the partial fixed point of
the operator $F_\psi$.
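
The definition of the partial fixed point can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the name `partial_fixed_point`, the encoding of a formula as a Python predicate `psi(rel, a)`, and the use of Floyd's cycle detection are choices made here and do not come from the text. Cycle detection is used so that only two stages are held in memory at any time.

```python
from itertools import product


def partial_fixed_point(universe, arity, psi):
    """Partial fixed point of the operator R -> {a : psi(R, a)} on a finite universe."""
    tuples = list(product(universe, repeat=arity))

    def step(rel):
        # One application of the operator defined by psi.
        return frozenset(a for a in tuples if psi(rel, a))

    # Floyd cycle detection on the eventually periodic sequence of stages.
    slow = step(frozenset())
    fast = step(slow)
    while slow != fast:
        slow = step(slow)
        fast = step(step(fast))
    # slow lies on the eventual cycle; it is a fixed point exactly if the period is one.
    return slow if step(slow) == slow else frozenset()


# A monotone example: nodes reachable from 0 (the induction converges).
edges = {(0, 1), (1, 2)}
reachable = partial_fixed_point(
    range(4), 1,
    lambda rel, a: a[0] == 0 or any((y,) in rel and (y, a[0]) in edges for y in range(4)))

# A non-monotone example: R -> complement of R alternates with period two.
flipping = partial_fixed_point(range(3), 1, lambda rel, a: a not in rel)
```

In the second example the stages alternate between the empty set and the full universe, so the partial fixed point is empty by definition.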

Note that if $R$ occurs only positively in $\psi$, then


$$[\mathrm{lfp}\, R\bar{x}\,.\,\psi(R,\bar{x})](\bar{t}) \equiv [\mathrm{pfp}\, R\bar{x}\,.\,\psi(R,\bar{x})](\bar{t}),$$


so we have that $\mathrm{LFP} \leq \mathrm{PFP}$. However, PFP seems to be much more powerful
than LFP. For instance, while a least-fixed-point induction on finite structures 
always reaches the fixed point in a polynomial number of iterations, a partial- 
fixed-point induction may need an exponential number of stages. 

Example 3.3.53. Consider the sequence of stages $R^m$ defined by the formula


$$\psi(R,x) := (Rx \wedge \exists y(y < x \wedge \neg Ry)) \vee (\neg Rx \wedge \forall y(y < x \rightarrow Ry)) \vee \forall y\, Ry$$


on a finite linear order $(A, <)$. It is easily seen that the fixed point reached
by this induction is the set $R = A$, but before this fixed point is reached, the
induction goes in lexicographic order through all possible subsets of $A$. Hence
the fixed point is reached at stage $2^n - 1$, where $n = |A|$.
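
The induction of this example behaves like a binary counter, and that can be observed directly. The Python sketch below is an illustration only: the function name `counter_stages` and the choice of the universe $\{0, \ldots, n-1\}$ with its natural order are assumptions made here. It iterates the operator from the empty set and reports the first stage that is a fixed point.

```python
def counter_stages(n):
    """Return (stage, relation) for the first stage of the induction that is a fixed point."""
    universe = range(n)

    def psi(rel, x):
        keep = x in rel and any(y not in rel for y in range(x))    # Rx and some smaller y is missing
        carry = x not in rel and all(y in rel for y in range(x))   # not Rx and all smaller y are in R
        full = all(y in rel for y in universe)                     # every element is in R
        return keep or carry or full

    rel, stage = frozenset(), 0
    while True:
        nxt = frozenset(x for x in universe if psi(rel, x))
        if nxt == rel:
            return stage, rel
        rel, stage = nxt, stage + 1


stage, final = counter_stages(4)
```

Reading membership of element $x$ as bit $x$ of a binary number, each application of the operator adds one, which is why the number of stages is exponential in $n$.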


Simultaneous Inductions. 


As in the case of LFP, one can also extend IFP and PFP by simultaneous in- 
ductions over several formulae, but again, the simultaneous fixed-point logics 
S-IFP and S-PFP are not more expressive than their simple variants. However, 
the proof is a little different than in the case of LFP. It requires that one en- 
codes several relations into one and hence increases the arity of the fixed point 
variables. As a consequence, it seems to be unknown whether simultaneous 
monadic PFP collapses to simple monadic PFP. 


Complexity 


Although a PFP induction on a finite structure may go through exponentially 
many stages (with respect to the cardinality of the structure), each stage can 
be represented with polynomial storage space. As first-order formulae can be 
evaluated efficiently, it follows by a simple induction that PFP-formulae can 
be evaluated in polynomial space. 


Proposition 3.3.54. For every formula $\psi \in \mathrm{PFP}$, the set of finite models of
$\psi$ is in PSPACE; in short: $\mathrm{PFP} \subseteq \mathrm{PSPACE}$.


On ordered structures, one can use techniques similar to those used in pre- 
vious capturing results, to simulate polynomial-space-bounded computation 
by PFP-formulae [2, 99]. 


Theorem 3.3.55 (Abiteboul, Vianu, and Vardi). On ordered finite struc- 
tures, PFP captures PSPACE. 


Proof. It remains to prove that every class K of finite ordered structures that 
is recognizable in PSPACE, can be defined by a PFP-formula. 

Let $M$ be a polynomially space-bounded deterministic Turing machine
with state set $Q$ and alphabet $\Sigma$, recognizing (an encoding of) an ordered
structure $(\mathfrak{A},<)$ if and only if $(\mathfrak{A},<) \in K$. Without loss of generality, we
can make the following assumptions. For input structures of cardinality $n$, $M$
requires space less than $n^k - 2$, for some fixed $k$. For any configuration $C$ of $M$,
let $\mathrm{Next}(C)$ denote its successor configuration. The transition function of $M$ is
adjusted so that $\mathrm{Next}(C) = C$ if, and only if, $C$ is an accepting configuration.

We represent any configuration of $M$ with a current state
$q$, tape inscription $w_1 \cdots w_{m-1}$, and head position $i$, by the word
$\# w_1 \cdots w_{i-1}(q w_i) w_{i+1} \cdots w_{m-1} \#$ over the alphabet $\Gamma := \Sigma \cup (Q \times \Sigma) \cup \{\#\}$,
where $m = n^k$ and $\#$ is merely used as an end marker to make the following
description more uniform. When moving from one configuration to the next,
Turing machines make only local changes. We can therefore associate with
$M$ a function $f : \Gamma^3 \to \Gamma$ such that, for any configuration $C = c_0 \cdots c_m$, the
successor configuration $\mathrm{Next}(C) = c'_0 \cdots c'_m$ is determined by the rules


$$c'_0 = c'_m = \# \quad\text{and}\quad c'_i = f(c_{i-1}, c_i, c_{i+1}) \ \text{ for } 1 \leq i \leq m-1.$$
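
The locality of Turing machine steps can be illustrated by a small Python sketch of such a function $f$. Everything concrete in it is an assumption made for illustration: tape symbols are strings, a head cell is a pair `(state, symbol)`, the transition table `delta` maps such pairs to `(state, symbol, move)` with `move` in $\{-1, 0, +1\}$, states without transitions are simply stationary (the adjustment that only accepting configurations are stationary is not implemented), and the head is assumed never to move onto an end marker.

```python
END = "#"


def make_local_rule(delta):
    """Build the local rule f(left, mid, right) from delta[(state, symbol)] = (state, symbol, move)."""
    def is_head(cell):
        return isinstance(cell, tuple)

    def f(left, mid, right):
        if mid == END:
            return END
        if is_head(mid):
            if mid not in delta:              # no transition: the cell stays as it is
                return mid
            state, written, move = delta[mid]
            return (state, written) if move == 0 else written
        if is_head(left) and left in delta and delta[left][2] == +1:
            return (delta[left][0], mid)      # the head arrives from the left
        if is_head(right) and right in delta and delta[right][2] == -1:
            return (delta[right][0], mid)     # the head arrives from the right
        return mid

    return f


def next_configuration(config, f):
    """Apply f to every inner position; the end markers stay in place."""
    inner = [f(config[i - 1], config[i], config[i + 1]) for i in range(1, len(config) - 1)]
    return [END] + inner + [END]


# Hypothetical machine: replace 0 by 1 while moving right, then halt on the blank "_".
delta = {("s", "0"): ("s", "1", +1), ("s", "1"): ("s", "1", +1), ("s", "_"): ("acc", "_", 0)}
f = make_local_rule(delta)
config = [END, ("s", "0"), "0", "_", END]
after_one = next_configuration(config, f)
after_three = next_configuration(next_configuration(after_one, f), f)
```

Each new symbol depends only on the three symbols around its position, which is what allows the successor configuration to be described by a fixed disjunction over triples.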


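Recall that we encode structures so that there exist first-order formulae
$\beta_\sigma(\bar{x})$ such that $(\mathfrak{A},<) \models \beta_\sigma(\bar{a})$ if and only if the $\bar{a}$-th symbol of the input
configuration of $M$ for input $\mathrm{code}(\mathfrak{A},<)$ is $\sigma$. We now represent any configuration
$C$ in the computation of $M$ by a tuple $\bar{C} = (C_\sigma)_{\sigma \in \Gamma}$ of $k$-ary relations, where


$$C_\sigma := \{\bar{a} : \text{the } \bar{a}\text{-th symbol of } C \text{ is } \sigma\}.$$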


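The configuration at time $t$ is the stage $t+1$ of a simultaneous pfp induction
on $(\mathfrak{A},<)$, defined by the rules


$$C_\# \bar{x} := \forall\bar{z}(\bar{x} \leq \bar{z}) \vee \forall\bar{z}(\bar{z} \leq \bar{x})$$


and, for all $\sigma \in \Gamma - \{\#\}$,


$$C_\sigma \bar{x} := \Bigl(\beta_\sigma(\bar{x}) \wedge \bigwedge_{\gamma \in \Gamma} \forall\bar{z}\, \neg C_\gamma \bar{z}\Bigr) \vee \exists\bar{y}\,\exists\bar{z}\,\Bigl(\bar{y}+1 = \bar{x} \wedge \bar{x}+1 = \bar{z} \wedge \bigvee_{f(\alpha,\beta,\gamma)=\sigma} (C_\alpha \bar{y} \wedge C_\beta \bar{x} \wedge C_\gamma \bar{z})\Bigr).$$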


The first rule just says that each stage represents a word starting and ending 
with #. The other rules ensure that (1) if the given sequence C contains only 
empty relations (i.e. if we are at stage 0), then the next stage represents the 
input configuration, and (2) if the given sequence represents a configuration, 
then the following stage represents its successor configuration. 

By our convention, $M$ accepts its input if and only if the sequence of configurations
becomes stationary (i.e. reaches a fixed point). Hence $M$ accepts
$\mathrm{code}(\mathfrak{A},<)$ if and only if the relations defined by the simultaneous pfp induction
on $\mathfrak{A}$ of the rules described above are non-empty. Hence $K$ is PFP-definable.


An alternative characterization of PSPACE is possible in terms of the
database query language while, which consists essentially of first-order relational
updates and while-loops. Vardi [99] proved that the language while captures PSPACE on
ordered finite structures, and Abiteboul and Vianu proved that while and PFP
are equivalent on finite structures.

From the capturing results for PTIME and PSPACE we immediately obtain
the result that PTIME = PSPACE if, and only if, LFP = PFP on ordered
finite structures. The natural question arises of whether LFP and PFP can
be separated on the domain of all finite structures. For a number of logics,
separation results on arbitrary finite structures can be established by relatively
simple methods, even if the corresponding separation on ordered structures
would solve a major open problem in complexity theory. For instance, we
have proved by quite a simple argument that $\mathrm{DTC} \subsetneq \mathrm{TC}$, and it is also not
very difficult to show that $\mathrm{TC} \subsetneq \mathrm{LFP}$ (indeed, TC is contained in stratified
Datalog, which is also strictly contained in LFP; see Sect. 3.3.10). Further, it
is trivial that LFP is less expressive than $\Sigma^1_1$ on all finite structures. However,
the situation is different for LFP vs. PFP.


Theorem 3.3.56 (Abiteboul and Vianu). LFP and PFP are equivalent 
on finite structures if, and only if, PTIME = PSPACE. 


3.3.10 Datalog and Stratified Datalog 


Datalog and its extensions are a family of rule-based database query languages 
that extend the conjunctive queries by a relational recursion mechanism sim- 
ilar to the one used in fixed-point logics. Indeed, as we shall see, Datalog can 
be seen as a fragment of least fixed point logic. For the purpose of this section 
we simply identify a relational database with a finite relational structure. This 
is not adequate for all aspects of database theory, but for the questions con- 
sidered here it is appropriate. For further information on databases, see [1], 
for example. 


Definition 3.3.57. A Datalog rule is an expression of the form $H \leftarrow B_1 \wedge \cdots \wedge B_m$,
where $H$, the head of the rule, is an atomic formula $R\bar{u}$, and
$B_1 \wedge \cdots \wedge B_m$, the body of the rule, is a conjunction of literals (i.e. atoms or
negated atoms) of the form $S\bar{v}$ or $\neg S\bar{v}$, where $\bar{u}, \bar{v}$ are tuples of variables or
constants. The relation symbol $R$ is called the head predicate of the rule.
We also allow Boolean head predicates. A Datalog rule is positive if it does
not contain negative literals.

A Datalog program $\Pi$ is a finite collection of rules such that none of its
head predicates occurs negated in the body of any rule. The predicates that
appear only in the bodies of the rules are called input predicates. The input
vocabulary of $\Pi$ is the set of input predicates and constants appearing in
$\Pi$.


Example 3.3.58. The Datalog program $\Pi_{\mathrm{reach}}$ consists of the three rules
$Txy \leftarrow Exy$, $Txz \leftarrow Txy \wedge Tyz$, $Ry \leftarrow Tay$.


The input vocabulary is {E,a}, and the head predicates are T and R. 

Given a structure $\mathfrak{A}$ over the input vocabulary, the program computes an
interpretation of the head predicates, i.e. it defines an expansion
$\Pi(\mathfrak{A}) := (\mathfrak{A}, R_1, \ldots, R_k)$ of $\mathfrak{A}$, where the $R_i$ are the values of the head predicates as
computed by $\Pi$. This interpretation can be defined in several equivalent ways,
for instance via minimal-model semantics or fixed-point semantics. We can
read a Datalog rule $r$ as a formula $\psi_r := R\bar{x} \leftarrow B_1 \wedge \cdots \wedge B_m$, and associate with the
program $\Pi$ the universal closure of the conjunction over these formulae:


$$\psi[\Pi] := \forall\bar{z} \bigwedge_{r \in \Pi} \psi_r.$$


We can compare expansions of $\mathfrak{A}$ by componentwise inclusion of the additional
predicates: $(\mathfrak{A}, R_1, \ldots, R_k) \subseteq (\mathfrak{A}, R'_1, \ldots, R'_k)$ if $R_i \subseteq R'_i$ for all $i$.
According to the minimal-model semantics, $\Pi(\mathfrak{A})$ is the minimal expansion
$(\mathfrak{A}, R_1^{\mu}, \ldots, R_k^{\mu})$ that satisfies $\psi[\Pi]$.


Example 3.3.59. The formula associated with the program $\Pi_{\mathrm{reach}}$ of Example 3.3.58
is


$$\forall x \forall y \forall z\,((Txy \leftarrow Exy) \wedge (Txz \leftarrow Txy \wedge Tyz) \wedge (Ry \leftarrow Tay)).$$


The minimal expansion of a graph $G = (V, E)$ with a distinguished node
$a$ is $\Pi_{\mathrm{reach}}(G,a) = (G, a, T, R)$, where $T$ is the transitive closure of $E$ and $R$
is the set of points reachable by a path from $a$.


Exercise 3.3.60. Prove that minimal-model semantics is well-defined: for every
Datalog program $\Pi$ and every input database $\mathfrak{A}$, there is a unique minimal
expansion of $\mathfrak{A}$ that is a model of $\psi[\Pi]$.


For the case of fixed-point semantics, we read a rule $R\bar{x} \leftarrow \beta(\bar{x},\bar{y})$
as an update operator: whenever an instantiation $\beta(\bar{a},\bar{b})$ of the body of the
rule is true for the current interpretation of the head predicates, make the
corresponding instantiation $R\bar{a}$ of the head true. Initially, let all head predicates
be empty. At each stage, apply simultaneously the update operators
for all rules of the program to the current interpretation of $(R_1, \ldots, R_k)$. Iterate
this operation until a fixed point $(R_1^{\infty}, \ldots, R_k^{\infty})$ is reached. Now let
$\Pi(\mathfrak{A}) := (\mathfrak{A}, R_1^{\infty}, \ldots, R_k^{\infty})$.
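
The fixed-point semantics translates almost literally into a naive evaluation procedure. The Python sketch below is an illustration under assumptions made here: rules are encoded as pairs `((pred, args), body)` with body literals `(positive, pred, args)`, variables are strings beginning with `?`, and bodies are checked by brute force over all assignments, which is simple but far from efficient. It is run on the program of Example 3.3.58, with the constant $a$ interpreted as node 0.

```python
from itertools import product


def is_var(term):
    """Variables are strings starting with '?'; everything else is a constant."""
    return isinstance(term, str) and term.startswith("?")


def evaluate(rules, universe, inputs):
    """Naive fixed-point evaluation; returns the final values of the head predicates."""
    heads = {head[0] for head, _ in rules}
    db = {pred: set(facts) for pred, facts in inputs.items()}
    for pred in heads:
        db[pred] = set()                      # initially all head predicates are empty
    while True:
        derived = set()
        for (head_pred, head_args), body in rules:
            terms = list(head_args) + [t for _, _, args in body for t in args]
            variables = sorted({t for t in terms if is_var(t)})
            for values in product(universe, repeat=len(variables)):
                env = dict(zip(variables, values))
                ground = lambda args: tuple(env.get(t, t) for t in args)
                if all((ground(args) in db.get(pred, set())) == positive
                       for positive, pred, args in body):
                    derived.add((head_pred, ground(head_args)))
        # Apply all updates simultaneously, then test whether a fixed point is reached.
        new = {(p, fact) for p, fact in derived if fact not in db[p]}
        if not new:
            return {pred: db[pred] for pred in heads}
        for p, fact in new:
            db[p].add(fact)


edges = {(0, 1), (1, 2)}
reach = [
    (("T", ("?x", "?y")), [(True, "E", ("?x", "?y"))]),
    (("T", ("?x", "?z")), [(True, "T", ("?x", "?y")), (True, "T", ("?y", "?z"))]),
    (("R", ("?y",)), [(True, "T", (0, "?y"))]),   # the constant a is node 0 here
]
result = evaluate(reach, range(4), {"E": edges})
```

Negated input predicates are handled by the same membership test with `positive` set to `False`; since head predicates never occur negated, the stages are increasing and the loop terminates.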


Exercise 3.3.61. Prove that minimal-model semantics and fixed-point semantics
coincide: for all $\Pi$ and $\mathfrak{A}$, $(R_1^{\mu}, \ldots, R_k^{\mu}) = (R_1^{\infty}, \ldots, R_k^{\infty})$.


Definition 3.3.62. A Datalog query is a pair $(\Pi, R)$ consisting of a Datalog
program $\Pi$ and a designated head predicate $R$ of $\Pi$. With every structure
$\mathfrak{A}$, the query $(\Pi, R)$ associates the result $(\Pi, R)^{\mathfrak{A}}$, the interpretation of $R$ as
computed by $\Pi$ from the input $\mathfrak{A}$.


We now relate Datalog to LFP. We shall show that each Datalog query
$(\Pi, R)$ is equivalent to a formula $\psi(\bar{x}) \in \mathrm{LFP}$, in fact one of very special form.

Let $\Pi$ be a Datalog program with input vocabulary $\tau$ and head predicates
$R_1, \ldots, R_k$. We first normalize the rules such that all rules with head predicate
$R_i$ have the same head $R_i x_1 \cdots x_{r_i}$. This can be done by appropriate
substitutions in the rule body and by adding equalities. For instance, a rule
$Rxyyx \leftarrow \beta(x,y,z)$ can be rewritten as $Rx_1x_2x_3x_4 \leftarrow \beta(x_1,x_2,z) \wedge x_3 = x_2 \wedge x_4 = x_1$.
We then have a program containing, for each head predicate $R_i$,
rules $r_{ij}$ of the form $R_i\bar{x} \leftarrow \beta_{ij}(\bar{x},\bar{y})$, where $\beta_{ij}$ is a conjunction of literals and
equalities. We then combine the update operators associated with the same
head predicate and describe the update of $R_i$ by the existential first-order
formula $\gamma_i(\bar{x}) := \bigvee_j \exists\bar{y}\, \beta_{ij}(\bar{x},\bar{y})$. As a consequence, the fixed-point semantics
of $\Pi$ is described by the system


$$S := \begin{cases} R_1\bar{x} := \gamma_1 \\ \quad\vdots \\ R_k\bar{x} := \gamma_k \end{cases}$$


of first-order update rules, and the query $(\Pi, R_i)$ is equivalent to the formula
$[\mathrm{lfp}\, R_i : S](\bar{x})$. Hence every Datalog query is equivalent to an LFP-formula, in
which fixed-point operators are applied only to existential formulae.
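
As a worked instance (an illustration added here, not taken from the text), applying this normalization to the program of Example 3.3.58, with the head variables renamed to $x_1, x_2$, would give the following update formulae and system:

```latex
\begin{align*}
\gamma_T(x_1,x_2) &:= Ex_1x_2 \;\vee\; \exists y\,(Tx_1y \wedge Tyx_2), \\
\gamma_R(x_1)     &:= Tax_1, \\
S &:= \begin{cases} Tx_1x_2 := \gamma_T \\ Rx_1 := \gamma_R \end{cases}
\end{align*}
```

Under this reading, the query asking for the value of the head predicate $R$ corresponds to the formula $[\mathrm{lfp}\, R : S](x_1)$, and both update formulae are existential, as the general argument requires.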


Definition 3.3.63. Existential fixed-point logic, denoted EFP, is the set 
of (simultaneous) LFP-formulae without universal quantifiers and without 
gfp-operators, and where negations are applied to atomic formulae only. 


We have seen that $\mathrm{Datalog} \subseteq \mathrm{EFP}$. The converse is also true, which can
be established by a straightforward induction: with every formula $\psi \in \mathrm{EFP}$
one associates a Datalog program $\Pi_\psi$ with a distinguished head predicate $H_\psi$,
such that the query $(\Pi_\psi, H_\psi)$ is equivalent to $\psi$. We leave the details as an
exercise.


Proposition 3.3.64. Datalog is equivalent to EFP. 


We know that LFP captures PTIME on ordered finite structures. The 
question arises of whether Datalog is sufficiently powerful to do the same. 
The answer depends on the precise variant of Datalog and on the notion of 
ordered structures that is used. We distinguish three cases. 

(1) A simple monotonicity argument shows that Datalog is weaker than
PTIME on structures where only a linear order, but not a successor relation, is
given. If $\mathfrak{A}$ is a substructure of $\mathfrak{B}$, then $(\Pi, R)^{\mathfrak{A}} \subseteq (\Pi, R)^{\mathfrak{B}}$ for every Datalog
query $(\Pi, R)$. Of course, there even exist very simple first-order queries that
are not monotone in this sense. Note that this argument does break down on
databases where a successor relation $S$ (rather than just a linear order), and
constants 0 and $e$ for the first and last elements are given. Exercise: why?

(2) In the literature, Datalog programs are often required to contain only
positive rules, i.e. the input predicates also can be used only positively. This
restricted variant is too weak to capture PTIME, even on successor structures.
If input predicates can be used only positively, then queries are monotone
under extensions of the input relations: if a database $\mathfrak{B}$ is obtained from $\mathfrak{A}$
by augmenting some of the input relations, then again $(\Pi, R)^{\mathfrak{A}} \subseteq (\Pi, R)^{\mathfrak{B}}$.


Exercise 3.3.65. Prove this monotonicity property, and give examples of
first-order queries that cannot be defined by Datalog programs. 


(3) In the case of programs with negations of input predicates and 
databases with a successor relation and constants 0 and e for the first and last 
elements, we can capture PTIME by Datalog. This was originally established 
in [13, 91] and is implicit also in [67]. 


Theorem 3.3.66 (Blass, Gurevich, and Papadimitriou). On successor 
structures, Datalog (with negations of input predicates) captures PTIME. 


Proof. This result can be established in several ways, for instance by a reduction
from $\Sigma^1_1$-HORN (making use of the fact that PTIME is closed under
complement). Instead, we give a direct proof.

It is clear that Datalog queries are computable in polynomial time. It 
remains to prove that every class K of finite successor structures that is rec- 
ognizable in PTIME can be defined by a Boolean Datalog query. 

Let $M$ be a polynomial-time Turing machine with state set $Q$ and alphabet
$\Sigma$, recognizing (an encoding of) a successor structure $\mathfrak{A}$ if and only if $\mathfrak{A} \in K$.
We denote the cardinality of the input structure $\mathfrak{A}$ by $n$ and assume that the
computation time of $M$ on $\mathfrak{A}$ is less than $n^k$.

The construction is similar to the proof of Theorem 3.3.55. Configurations
of $M$ are represented by words $\# w_1 \cdots w_{i-1}(q w_i) w_{i+1} \cdots w_{m-1} \#$ over the
alphabet $\Gamma := \Sigma \cup (Q \times \Sigma) \cup \{\#\}$, where $m = n^k$, and we describe the
behaviour of $M$ by a function $f : \Gamma^3 \to \Gamma$ such that, for any configuration
$C = c_0 \cdots c_m$, the successor configuration $\mathrm{Next}(C) = c'_0 \cdots c'_m$ is determined
by the rules


$$c'_0 = c'_m = \# \quad\text{and}\quad c'_i = f(c_{i-1}, c_i, c_{i+1}) \ \text{ for } 1 \leq i \leq m-1.$$


Let $S$ be a $2k$-ary relation symbol and let $\Pi_S$ be a Datalog program with
head predicate $S$, computing the successor relation on $k$-tuples (associated
with the lexicographic order defined by the given successor relation). Recall
that we can encode successor structures so that there exist quantifier-free
formulae $\beta_\sigma(\bar{x})$ such that $\mathfrak{A} \models \beta_\sigma(\bar{a})$ if, and only if, the $\bar{a}$-th symbol of the
input configuration of $M$ for $\mathrm{code}(\mathfrak{A})$ is $\sigma$. Let $(\Pi_\sigma, H_\sigma)$ be a Datalog query
equivalent to $\beta_\sigma(\bar{x})$.

We represent the computation of $M$ by a tuple $\bar{C} = (C_\sigma)_{\sigma \in \Gamma}$ of $2k$-ary
relations, where


$$C_\sigma := \{(\bar{a},\bar{t}) : \text{the } \bar{a}\text{-th symbol of the configuration at time } \bar{t} \text{ is } \sigma\}.$$


The Datalog program associated with M consists of 


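(1) the program $\Pi_S$ defining the successor relation on $k$-tuples;
(2) the programs $\Pi_\sigma$ for describing the input;
(3) the rules

    $C_\# \bar{0}\,\bar{t}$
    $C_\# \bar{e}\,\bar{t}$
    $C_\sigma \bar{x}\,\bar{0} \leftarrow H_\sigma \bar{x}$   for all $\sigma \in \Gamma - \{\#\}$;
(4) for all $\alpha, \beta, \gamma, \sigma$ with $f(\alpha,\beta,\gamma) = \sigma$, the rule
    $C_\sigma \bar{y}\,\bar{t}' \leftarrow S\bar{x}\bar{y} \wedge S\bar{y}\bar{z} \wedge S\bar{t}\bar{t}' \wedge C_\alpha \bar{x}\bar{t} \wedge C_\beta \bar{y}\bar{t} \wedge C_\gamma \bar{z}\bar{t}$;
(5) the rule
    $\mathrm{Acc} \leftarrow C_{(q,w)} \bar{x}\,\bar{t}$   for any accepting state $q$ and any symbol $w$.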


The first two rules in (3) say that each configuration starts and ends with
#; the following set of rules ensures that the configuration at time 0 is the
input configuration. The rules in (4) imply that from time $\bar{t}$ to time
$\bar{t}' = \bar{t}+1$ the computation proceeds as required by $M$, and the last rule makes
the Boolean predicate Acc true if and only if an accepting state has been
reached. Obviously, $M$ accepts the input structure $\mathfrak{A}$ if, and only if, the query
$(\Pi_M, \mathrm{Acc})$ evaluates to true on $\mathfrak{A}$.


Almost the same proof shows that the expression complexity of Datalog 
(and hence of LFP) is EXPTIME-complete (see also Theorem 3.3.36). 


Theorem 3.3.67. The evaluation problem for Datalog programs (with head 
predicates of unbounded arity) is complete for EXPTIME, even for programs 
with only positive rules, and for a fixed database with only two elements. 


Proof. By the results of Section 3.3.5, LFP-formulae, and hence also Datalog 
programs, can be evaluated in polynomial time with respect to the size of 
the input structure and in exponential time with respect to the length of the 
formula (or program). 

To prove completeness, we fix a database $\mathfrak{A}$ with two elements and constant
symbols 0, 1 (or, alternatively, two unary relations $P_0 = \{0\}$ and $P_1 = \{1\}$).
Let $M$ be a deterministic Turing machine that accepts or rejects input words
$w = w_0 \cdots w_{m-1} \in \{0,1\}^*$ in time $2^{m^d}$ (for some fixed $d$). For every input $w$
for $M$, we construct a Datalog program $\Pi_{M,w}$ which evaluates, on the fixed
database $\mathfrak{A}$, a Boolean head predicate Acc to true if, and only if, $M$ accepts
$w$.

The construction is similar to that in the proof of Theorem 3.3.66, with
the following two differences. Whereas in the previous proof $k$ was fixed and
$n$ depended on the input, it is now the other way round, with $n := 2$ and
$k := m^d$. Further, the description of the input configuration is now simpler:
we just explicitly list the atomic facts defining the input configuration for the
given input $w$. Note that this is the only part of the program that depends on
$w$; the remaining rules depend only on $M$ and the length of the input. Finally,
note that the program contains only positive rules.


Stratified Datalog 


Datalog defines in a natural way queries that require recursion (such as tran- 
sitive closure), but is very weak in other respects, mainly because it does not 
include negation. 

There exist various possible ways to add negation to Datalog. 


Definition 3.3.68. A stratified Datalog program is a sequence $\Pi = (\Pi_0, \ldots, \Pi_r)$
of basic Datalog programs, which are called the strata of $\Pi$,
such that each of the head predicates of $\Pi$ is a head predicate in precisely one
stratum $\Pi_i$ and is used as an input predicate only in higher strata $\Pi_j$, where
$j > i$. In particular, this means that


(1) if a head predicate of stratum $\Pi_j$ occurs positively in the body of a rule
of stratum $\Pi_i$, then $j \leq i$, and

(2) if a head predicate of stratum $\Pi_j$ occurs negatively in the body of a rule
of stratum $\Pi_i$, then $j < i$.


The semantics of a stratified program is defined stratum by stratum. The
input predicates of a stratum $\Pi_i$ are either input predicates of the entire
program $\Pi$ or are head predicates of a lower stratum. Hence, once the lower
strata are evaluated, we can compute the interpretation of the head predicates
of $\Pi_i$ as in the case of basic Datalog programs.


Clearly the power of stratified Datalog is between that of Datalog and LFP, 
and hence stratified Datalog provides yet another formalism that captures 
PTIME on ordered structures. On unordered structures stratified Datalog is 
strictly more expressive than Datalog (as it includes all of first-order logic) 
but strictly less powerful than LFP. The main example separating LFP from 
Stratified Datalog is the GAME query, which defines the winning positions 
of Player 0 in a strictly alternating game. It is defined by the LFP formula
$[\mathrm{lfp}\, Wx\,.\,\exists y(Exy \wedge \forall z(Eyz \rightarrow Wz))](x)$. This involves a recursion through
a universal quantifier, which in general cannot be done in stratified Datalog
[31, 79].
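
To see the recursion through the universal quantifier at work, the least fixed point defining the GAME query can be computed by plain stage iteration. The Python sketch below is an illustration only; the function name `game_winning_positions` and the representation of the game graph as a set of edge pairs are assumptions made here.

```python
def game_winning_positions(nodes, edges):
    """Least fixed point of W -> {x : some successor y of x has all its successors in W}."""
    succ = {v: {w for (u, w) in edges if u == v} for v in nodes}
    win = set()
    while True:
        # x is winning if Player 0 can move to some y from which every reply z is winning.
        new = {x for x in nodes
               if any(all(z in win for z in succ[y]) for y in succ[x])}
        if new == win:
            return win
        win = new


# On the path 0 -> 1 -> 2 -> 3, Player 0 wins from the nodes at even distance 1 or 3 from the end.
path_winners = game_winning_positions(range(4), {(0, 1), (1, 2), (2, 3)})
cycle_winners = game_winning_positions(range(2), {(0, 1), (1, 0)})
```

The inner `all` ranges over every reply of the opponent, and it is this universally quantified use of the relation under construction that a stratified program has no direct way to express.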


Theorem 3.3.69 (Dahlhaus and Kolaitis). No stratified Datalog program
can express the GAME query. Hence stratified Datalog $\subsetneq$ LFP.


Example 3.3.70. Another interesting class of examples showing the limits of 
stratified Datalog is that of well-foundedness properties, or statements saying 
that on all infinite paths one will eventually hit a node with a certain property 
$P$. These are typical statements in the field of verification (expressed in CTL
by the formula AFP).

In LFP, the well-foundedness of a partial order $<$ would be expressed as
$\forall y\,[\mathrm{lfp}\, Wy\,.\,\forall x(x < y \rightarrow Wx)](y)$. The CTL-formula AFP is expressed in $L_\mu$
by $\mu X.\,P \vee \Box X$ and in LFP by $[\mathrm{lfp}\, Rx\,.\,Px \vee \forall y(Exy \rightarrow Ry)](x)$.

On finite structures, such properties are definable by stratified Datalog 
programs, since they are essentially negations of reachability problems for 
cycles. Indeed, AFP means that there is no path that eventually cycles and 
on which P is globally false. This can be expressed by the following stratified 
program: 


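$Txy \leftarrow \neg Px \wedge Exy \wedge \neg Py$        $Txz \leftarrow Txy \wedge Eyz \wedge \neg Pz$
$Sx \leftarrow Txx$                                      $Sx \leftarrow \neg Px \wedge Exy \wedge Sy$
$Rx \leftarrow \neg Sx$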


The first stratum computes the set T of all pairs of nodes (u,v) such that 
there exists a path from u to v on which P is false, and the set S of all nodes 
from which there exists such a path that eventually cycles. Here the finiteness 
of the graph is used in an essential way, because only this guarantees that 
every infinite path eventually reaches a cycle. The second stratum takes the 
complement of S. 
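
On a finite graph, the agreement between the stratified program and the least-fixed-point definition can be checked directly. The following Python sketch is only an illustration: the function names and the sample graph are assumptions made here, and the predicates T and S of the first stratum are evaluated one after the other, which gives the same result as simultaneous evaluation because T does not depend on S.

```python
def af_stratified(nodes, edges, p):
    """Evaluate T and S by iteration (first stratum), then R as the complement of S."""
    t = set()
    while True:
        new_t = {(x, y) for (x, y) in edges if x not in p and y not in p}
        new_t |= {(x, z) for (x, y) in t for (y2, z) in edges if y == y2 and z not in p}
        if new_t <= t:
            break
        t |= new_t
    s = set()
    while True:
        new_s = {x for (x, y) in t if x == y}                         # Sx <- Txx
        new_s |= {x for (x, y) in edges if x not in p and y in s}     # Sx <- not Px, Exy, Sy
        if new_s <= s:
            break
        s |= new_s
    return set(nodes) - s                                             # Rx <- not Sx


def af_lfp(nodes, edges, p):
    """Stages of the induction R -> {x : Px or every successor of x is in R}."""
    r = set()
    while True:
        new_r = {x for x in nodes if x in p or all(y in r for (u, y) in edges if u == x)}
        if new_r == r:
            return r
        r = new_r


# Hypothetical graph: 0 and 1 form a cycle avoiding P; node 3 satisfies P.
nodes = range(5)
edges = {(0, 1), (1, 0), (2, 0), (2, 3), (3, 3), (4, 3)}
p = {3}
by_program = af_stratified(nodes, edges, p)
by_formula = af_lfp(nodes, edges, p)
```

Node 2 is excluded by both computations because it can move into the cycle on which P never holds, while node 4 is included because its only successor satisfies P.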

However, it can be shown that no stratified Datalog program can express 
such statements on infinite structures (even countable ones). 

Another variant of Datalog, called Datalog LITE, which can express all 
CTL properties and moreover admits linear-time evaluation algorithms (and 
which is incomparable with stratified Datalog), has been defined and studied 
in [45]. 


A stratified Datalog program is linear if in the body of each rule there is 
at most one occurrence of a head predicate of the same stratum (but there 
may be arbitrarily many occurrences of head predicates from lower strata).


Example 3.3.71. The program $\Pi_{\mathrm{reach}}$ in Example 3.3.58 is not linear, but
by replacing the second, non-linear rule $Txz \leftarrow Txy \wedge Tyz$ by the linear rule
$Txz \leftarrow Txy \wedge Eyz$ we obtain an equivalent linear program. However, one pays
a price for the linearization. The original program reaches the fixed point after
$O(\log m)$ iterations, while the linear program needs $m$ iterations, where $m$ is
the length of the longest path in the graph.
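
The difference in the number of iterations is easy to observe experimentally. The Python sketch below is an illustration only; the function name `stages_until_fixed_point` and the test graph (a simple directed path) are assumptions made here. It iterates the rules for T in either the linear or the non-linear form and counts the stages needed before nothing changes.

```python
def stages_until_fixed_point(edges, linear):
    """Return (number of stages, transitive closure) for the chosen form of the recursive rule."""
    t, stages = set(), 0
    while True:
        # Second body atom of the recursive rule: Eyz in the linear form, Tyz otherwise.
        step = edges if linear else t
        new_t = set(edges) | {(x, z) for (x, y) in t for (y2, z) in step if y == y2}
        if new_t == t:
            return stages, t
        t, stages = new_t, stages + 1


path = {(i, i + 1) for i in range(16)}          # a path with 16 edges
linear_stages, linear_tc = stages_until_fixed_point(path, linear=True)
doubling_stages, doubling_tc = stages_until_fixed_point(path, linear=False)
```

On this path the linear form extends known paths by one edge per stage, whereas the non-linear form doubles the covered path length at each stage; both arrive at the same relation.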


Linear programs suffice to define transitive closures, so it follows by a
straightforward induction that TC $\subseteq$ linear stratified Datalog. The converse
is also true (see [38, 46]).


Proposition 3.3.72. Linear stratified Datalog is equivalent to TC. 


Corollary 3.3.73. On ordered structures, linear stratified Datalog captures 
NLOGSPACE. 


3.4 Logics with Counting


From the point of view of expressiveness, first-order logic has two main de- 
ficiencies: it lacks the power to express anything that requires recursion (the 
simplest example is transitive closure) and it cannot count, as witnessed by 
the impossibility to express that a structure has even cardinality, or, more 
generally, by the 0-1 law. We have already discussed a number of logics that 
add recursion in one way or another to FO (or part of it), notably the various 
forms of fixed-point logic. On ordered finite structures, some of these logics 
can express precisely the queries that are computable in PTIME or PSPACE. 
However, on arbitrary finite structures they do not, and almost all known ex- 
amples showing this involve counting. Whereas in the presence of an ordering, 
the ability to count is inherent in fixed-point logic, hardly any of this ability is 
retained in its absence. For instance, as LFP and PFP are fragments of $L^{\omega}_{\infty\omega}$,
the 0-1 law also holds for them.

Therefore Immerman proposed that counting quantifiers should be added 
to logics and asked whether a suitable variant of fixed-point logic with counting
would suffice to capture PTIME. Although Cai, Fürer and Immerman [23]
eventually answered this question negatively, fixed-point logic with counting 
has turned out to be an important and robust logic, that defines a natural 
level of expressiveness and allows one to capture PTIME on interesting classes 
of structures. 


3.4.1 Logics with Counting Terms 


There are different ways of adding counting mechanisms to a logic, which 
are not necessarily equivalent. The most straightforward possibility is the 
addition of quantifiers of the form $\exists^{\geq 2}$, $\exists^{\geq 3}$, etc., with the obvious meaning.
While this is perfectly reasonable for bounded-variable fragments of first-order
logic or infinitary logic (see e.g. [58, 89]), it is not general enough for
fixed-point logic, because it does not allow for recursion over the counting
parameters $i$ in quantifiers $\exists^{\geq i}x$. In fact, if the counting parameters are fixed
numbers, then adjoining the quantifiers $\exists^{\geq i}x$ does not give additional power to
logics such as FO or LFP, since they are closed under the replacement of $\exists^{\geq i}$
by $i$ existential quantifiers (whereas their restrictions to bounded width are
not). These counting parameters should therefore be considered as variables
that range over natural numbers. To define in a precise way a logic with
counting and recursion, one extends the original objects of study, namely
finite (one-sorted) structures $\mathfrak{A}$, to two-sorted auxiliary structures $\mathfrak{A}^*$ with a
second numerical (but also finite) sort. 


Definition 3.4.1. With any one-sorted finite structure $\mathfrak{A}$ with universe $A$, we
associate the two-sorted structure $\mathfrak{A}^* := \mathfrak{A} \mathbin{\dot\cup} (\{0, \ldots, |A|\}; <, 0, e)$, where $<$
is the canonical ordering on $\{0, \ldots, |A|\}$, and 0 and $e$ stand for the first and
the last element. Thus, we have taken the disjoint union of $\mathfrak{A}$ with a linear
order of length $|A| + 1$.

We start with first-order logic over two-sorted vocabularies $\sigma \cup \{<, 0, e\}$,
with semantics over structures $\mathfrak{A}^*$ defined in the obvious way. We shall use
Latin letters $x, y, z, \ldots$ for the variables over the first sort, and Greek letters
$\lambda, \mu, \nu, \ldots$ for variables over the second sort. The two sorts are related by
counting terms, defined by the following rule. Let $\varphi(x)$ be a formula with a
variable $x$ (over the first sort) among its free variables. Then $\#_x[\varphi]$ is a term
in the second sort, with the set of free variables $\mathrm{free}(\#_x[\varphi]) = \mathrm{free}(\varphi) - \{x\}$.
The value of $\#_x[\varphi]$ is the number of elements $a$ that satisfy $\varphi(a)$.

Counting logics of this form were introduced by Grädel and Otto [54]
and have been studied in detail in [89]. We start with first-order logic with 
counting, denoted by (FO + C), which is the closure of two-sorted first-order 
logic under counting terms. Here are two simple examples that illustrate the 
use of counting terms. 


Example 3.4.2. On an undirected graph $G = (V,E)$, the formula
$\forall x \forall y(\#_z[Exz] = \#_z[Eyz])$ expresses the assertion that every node has the
same degree, i.e., that $G$ is regular.


Example 3.4.3. We present below a formula $\psi(E_1, E_2) \in$ (FO + C) which expresses
the assertion that two equivalence relations $E_1$ and $E_2$ are isomorphic;
of course a necessary and sufficient condition for this is that for every $i$, they
have the same number of elements in equivalence classes of size $i$:


$$\psi(E_1, E_2) \equiv (\forall\mu)\,(\#_x[\#_y[E_1xy] = \mu] = \#_x[\#_y[E_2xy] = \mu]).$$
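
Both examples can be evaluated mechanically once a counting term is read as a function returning a number. The Python sketch below is an illustration only; the helper names `count`, `is_regular` and `isomorphic_equivalences`, and the representation of relations as sets of pairs over a list `universe`, are assumptions made here.

```python
def count(universe, phi):
    """Value of a counting term: the number of elements a of the universe with phi(a)."""
    return sum(1 for a in universe if phi(a))


def is_regular(nodes, edges):
    """For all x and y, the number of z with Exz equals the number of z with Eyz."""
    return all(count(nodes, lambda z: (x, z) in edges) == count(nodes, lambda z: (y, z) in edges)
               for x in nodes for y in nodes)


def isomorphic_equivalences(universe, e1, e2):
    """For every number mu in 0..|A|, both relations have equally many elements in classes of size mu."""
    numbers = range(len(universe) + 1)           # the second, numerical sort
    return all(
        count(universe, lambda x: count(universe, lambda y: (x, y) in e1) == mu)
        == count(universe, lambda x: count(universe, lambda y: (x, y) in e2) == mu)
        for mu in numbers)


def equivalence(classes):
    """Helper: the equivalence relation with the given classes."""
    return {(a, b) for c in classes for a in c for b in c}


universe = [0, 1, 2, 3]
cycle = {(i, (i + 1) % 4) for i in universe} | {((i + 1) % 4, i) for i in universe}
same_shape = isomorphic_equivalences(universe, equivalence([{0, 1}, {2, 3}]), equivalence([{0, 2}, {1, 3}]))
different_shape = isomorphic_equivalences(universe, equivalence([{0, 1}, {2, 3}]), equivalence([{0, 1, 2}, {3}]))
```

The quantification over `numbers` corresponds to the second-sort quantifier, which ranges over $\{0, \ldots, |A|\}$ and therefore covers every possible class size.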


3.4.2 Fixed-Point Logic with Counting 


We now define (inflationary) fixed point logic with counting (IFP + C) 
and partial fixed point logic with counting (PFP + C) by adding to (FO 
+ C) the usual rules for building inflationary or partial fixed points, ranging 
over both sorts. 


Definition 3.4.4. Inflationary fixed point logic with counting, (IFP + C), is 
the closure of two-sorted first-order logic under the following rules: 


(1) The rule for building counting terms. 

(2) The usual rules of first-order logic for building terms and formulae. 

(3) The fixed-point formation rule. Suppose that $\psi(R, \bar{x}, \bar{\mu})$ is a formula of
vocabulary $\tau \cup \{R\}$ where $\bar{x} = x_1,\dots,x_k$, $\bar{\mu} = \mu_1,\dots,\mu_\ell$, and $R$ has
mixed arity $(k, \ell)$, and that $(\bar{u}, \bar{\nu})$ is a $(k + \ell)$-tuple of first- and second-sort
terms, respectively. Then


$$[\mathbf{ifp}\ R\bar{x}\bar{\mu} \,.\, \psi](\bar{u}, \bar{\nu})$$


is a formula of vocabulary $\tau$.

The semantics of $[\mathbf{ifp}\ R\bar{x}\bar{\mu} \,.\, \psi]$ on $\mathfrak{A}^*$ is defined in the same way as for the
logic IFP, namely as the inflationary fixed point of the operator


$$F_\psi : R \longmapsto R \cup \{(\bar{a}, \bar{i}) \mid (\mathfrak{A}^*, R) \models \psi(\bar{a}, \bar{i})\}.$$


The definition of (PFP + C) is analogous, where we replace inflationary 
fixed points by partial ones. In the literature, one also finds different variants 
of fixed-point logic with counting where the two sorts are related by counting 
quantifiers rather than counting terms. Counting quantifiers have the form
$(\exists^{\geq i} x)$ for ‘there exist at least $i$ $x$’, where $i$ is a second-sort variable. It is
obvious that the two definitions are equivalent. In fact, (IFP + C) is a very
robust logic. For instance, its expressive power does not change if one permits
counting over tuples, even of mixed type, i.e. terms of the form $\#_{\bar{x},\bar{\mu}}\varphi$. One
can of course also define least fixed-point logic with counting, (LFP + C),
but one has to be careful with the positivity requirement (which is more
natural when one uses counting quantifiers rather than counting terms). The
equivalence of LFP and IFP readily translates to (LFP + C) = (IFP + C).
Further, there are a number of other logical formalizations of the concept of
inductive definability with counting that turn out to have the same expressive
power as (IFP + C) (see [54] and Sect. 3.4.3 below for details).


Example 3.4.5. An interesting example of an (IFP + C)-definable query is the 
method of stable colourings for graph canonization. Given a graph $G$ with a
colouring $f : V \to \{0,\dots,r\}$ of its vertices, we define a refinement $f'$ of $f$, giving
to a vertex $x$ the new colour $f'x = (fx, n_1, \dots, n_r)$ where $n_i = \#_y[Exy \wedge (fy = i)]$.
The new colours can be sorted lexicographically so that they again form
an initial subset of $\mathbb{N}$. Then the process can be iterated until a fixed point,
the stable colouring of $G$, is reached. It is easy to see that the stable colouring
of a graph is polynomial-time computable and uniformly definable in (IFP + C).

On many graphs, the stable colouring uniquely identifies each vertex, i.e. 
no two distinct vertices get the same stable colour. This is the case, for instance,
for all trees. Further, Babai, Erdős, and Selkow [8] proved that the
probability that this happens on a random graph with n nodes approaches 1 
as n goes to infinity. Thus stable colourings provide a polynomial-time graph 
canonization algorithm for almost all finite graphs. 
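The refinement step of Example 3.4.5 is easy to run by hand on small graphs. The following Python sketch is a minimal illustration under our own assumptions: the function name `stable_colouring` is hypothetical, edges are treated as undirected, the initial colouring defaults to the constant colouring, and neighbours are counted for every colour $0,\dots,r$.

```python
def stable_colouring(vertices, edges, colouring=None):
    """Iterate the refinement f -> f' until the partition into colour classes is stable."""
    vertices = list(vertices)
    if not vertices:
        return {}
    adj = {v: set() for v in vertices}
    for x, y in edges:                       # undirected: record both directions
        adj[x].add(y)
        adj[y].add(x)
    colour = dict(colouring) if colouring is not None else {v: 0 for v in vertices}
    while True:
        r = max(colour.values())
        # New colour of x: (f x, n_0, ..., n_r) with n_i = #_y[Exy and f y = i].
        signature = {
            x: (colour[x],) + tuple(
                sum(1 for y in adj[x] if colour[y] == i) for i in range(r + 1)
            )
            for x in vertices
        }
        # Sort the new colours lexicographically so that they form an initial segment of N.
        rank = {s: k for k, s in enumerate(sorted(set(signature.values())))}
        refined = {x: rank[signature[x]] for x in vertices}
        if len(rank) == len(set(colour.values())):   # no colour class was split
            return refined
        colour = refined
```

Since every new colour starts with the old one, each round can only split classes, so the loop stops after at most $|V|$ rounds. On a 6-cycle and on the disjoint union of two triangles the procedure returns the constant colouring in both cases, which is the standard reminder that stable colourings do not separate all non-isomorphic graphs.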


We now discuss the expressive power and evaluation complexity of fixed- 
point logic with counting. We are mainly interested in (IFP + C)-formulae 
and (PFP + C)-formulae without free variables over the second sort, so that 
we can compare them with the usual logics without counting. 


Exercise 3.4.6. Even without making use of counting terms, IFP over two-sorted
structures $\mathfrak{A}^*$ is more expressive than IFP over $\mathfrak{A}$. To prove this, construct
a two-sorted IFP-sentence $\psi$ such that $\mathfrak{A}^* \models \psi$ if, and only if, $|A|$ is
even.


It is clear that counting terms can be computed in polynomial time. Hence
the data complexity remains in PTIME for (IFP + C) and in PSPACE for 
(PFP + C). We shall see below that these inclusions are strict. 


Theorem 3.4.7. On finite structures, 


(1) IFP $\subsetneq$ (IFP + C) $\subsetneq$ PTIME.
(2) PFP $\subsetneq$ (PFP + C) $\subsetneq$ PSPACE.


Infinitary Logic with Counting 


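Let $C^k_{\infty\omega}$ be the infinitary logic with $k$ variables $L^k_{\infty\omega}$, extended by the quantifiers
$\exists^{\geq m}$ (‘there exist at least $m$’) for all $m \in \mathbb{N}$. Further, let $C^\omega_{\infty\omega} := \bigcup_k C^k_{\infty\omega}$.


Proposition 3.4.8. (IFP + C) $\subseteq C^\omega_{\infty\omega}$.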


Due to the two-sorted framework, the proof of this result is a bit more 
involved than for the corresponding result without counting, but not really 
difficult. We refer to [54, 89] for details. 

The separation of (IFP + C) from PTIME has been established by Cai, 
Fürer, and Immerman [23]. The proof also provides an analysis of the method 
of stable colourings for graph canonization. We have described this method
in its simplest form in Example 3.4.5. More sophisticated variants compute
and refine colourings of $k$-tuples of vertices. This is called the $k$-dimensional
Weisfeiler-Lehman method and, in logical terms, it amounts to labelling
each $k$-tuple by its type in $(k+1)$-variable logic with counting quantifiers.
It was conjectured that this method could provide a polynomial-time algorithm
for graph isomorphism, at least for graphs of bounded degree. However,
Cai, Fürer, and Immerman were able to construct two families $(G_n)_{n\in\mathbb{N}}$ and
$(H_n)_{n\in\mathbb{N}}$ of graphs such that on one hand, $G_n$ and $H_n$ have $O(n)$ nodes
and degree three, and admit a linear-time canonization algorithm, but on the
other hand, in first-order (or infinitary) logic with counting, $\Omega(n)$ variables
are necessary to distinguish between $G_n$ and $H_n$. In particular, this implies
Theorem 3.4.7.


Inflationary vs. Partial Fixed-Points 


By Theorem 3.3.56, partial fixed-point logic collapses to inflationary fixed- 
point logic if, and only if, PTIME = PSPACE. The analogous result in the 
presence of counting is also true [54, 89]: PTIME = PSPACE $\iff$ (IFP + C)
= (PFP + C).


3.4.3 Datalog with Counting 


Fixed-point formulae have the reputation of being difficult to read, and many 
people find formalisms such as Datalog easier to understand. In the presence of
a successor relation, Datalog (with negation over input predicates) is sufficient
to capture PTIME and hence is equally expressive as LFP. In general, however, 
Datalog and even its most natural extensions, notably stratified Datalog, are 
weaker than LFP. 

Counting terms can also be added to Datalog. We conclude this section 
by discussing Datalog with counting. We show that (Datalog + C) is closed 
under negation and equivalent to (IFP + C). In the presence of counting, 
the common extensions of Datalog, notably stratified Datalog, are therefore 
equivalent to Datalog. 


Definition 3.4.9. Datalog with counting, denoted by (Datalog + C), extends
Datalog by allowing two-sorted head predicates and counting terms.
The two-sorted head atoms have the form $R\bar{x}\bar{\mu}$, where $\bar{x}$ ranges over the first
sort, i.e. over elements of the input database $\mathfrak{A}$, and $\bar{\mu}$ ranges over the second
sort. For any atom $R\bar{x}\bar{\mu}$ we have a counting term $\#_x[R\bar{x}\bar{\mu}]$. A term over
the second sort is called an arithmetical term. The arithmetical terms are
either $0$, $e$, counting terms, or $t+1$, where $t$ is also an arithmetical term. Thus,
a program in (Datalog + C) is a finite set of clauses of the form


$$H \leftarrow B_1 \wedge \cdots \wedge B_m$$


where the head $H$ is an atomic formula $R\bar{x}\bar{\mu}$, and $B_1,\dots,B_m$ are atomic
formulae $R\bar{x}\bar{\mu}$ or equalities of terms (over the first or the second sort).


For every input database, the program computes intensional relations via 
the inflationary fixed-point semantics. Note that for classical Datalog pro- 
grams, it makes no difference whether the fixed-point semantics is defined 
to be inflationary or not, since the underlying operator is monotone anyway. 
However, for programs in (Datalog + C), the semantics has to be inflationary, 
since otherwise, the equalities of arithmetical terms give rise to non-monotone 
operators. For the same reason, the minimum-model semantics will no longer 
be defined. Since inflationary fixed-point semantics is one of the various equiv- 
alent ways to define the semantics of Datalog, both the syntax and the se- 
mantics of (Datalog + C) generalize Datalog in a natural way. 

One could also introduce counting in an (at first sight) more general form, 
namely by allowing counting terms of the form $\#_{\bar{x},\bar{\mu}}[R\bar{x}\bar{\mu}\bar{y}\bar{\nu}]$. While this may
be convenient for writing a program in shorter and more understandable form, 
it does not affect the power of (Datalog + C). 


Exercise 3.4.10. [54] Prove that counting over tuples, even of mixed type, 
does not increase the expressive power of (Datalog + C). 


Hence cardinalities of arbitrary predicates can be equated in a Datalog 
program: we take the liberty of writing equalities such as |Q| = |R| in the 
body of a rule, for simplicity. The following technical lemma is essential for 
reducing (IFP + C) to (Datalog + C).


Lemma 3.4.11. Let $\Pi$ be a (Datalog + C) program with head predicates
$Q_1,\dots,Q_r$. There exists another (Datalog + C) program $\Pi'$, whose head predicates
include $Q_1,\dots,Q_r$ and a Boolean control predicate $C^*$ such that


• $(\Pi', Q_i) = (\Pi, Q_i)$ for all $i$;
• $(\Pi', C^*)$ is true on all databases and $C^*$ becomes true only at the last stage
  of the evaluation of $\Pi'$.


Proof. In addition to $C^*$, we add a unary head predicate $C^0$ and, for every
head predicate $Q_i$ of $\Pi$, a new head predicate $Q'_i$ of the same arity. Then, $\Pi'$
is obtained by adding the following clauses to $\Pi$:


$$C^0x \leftarrow$$
$$Q'_i\bar{x}\bar{\mu} \leftarrow Q_i\bar{x}\bar{\mu} \quad \text{for } 1 \le i \le r$$
$$C^* \leftarrow C^0x \wedge (|Q_1| = |Q'_1|) \wedge \cdots \wedge (|Q_r| = |Q'_r|)$$


Observe that $Q'_i$ simply lags one step behind $Q_i$. The atom $C^0x$ is necessary
to avoid the possibility that $C^*$ is set to true in the first stage.


Lemma 3.4.11 essentially says that we can attach to any program a Boolean 
control predicate which becomes true when the evaluation of the program is 
terminated. We can then compose two Datalog programs while making sure 
that the evaluation of the second program starts only after the first has been 
terminated. As an initial application, we shall show that (Datalog + C) is 
closed under negation. 


Lemma 3.4.12. The complement of a (Datalog + C) query is also a (Datalog 
+ C) query. 


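Proof. Let $(\Pi, Q)$ be a (Datalog + C) query, and let $\Pi'$ be the program
specified in Lemma 3.4.11. Take a new variable $z$, and new head predicates $\tilde{Q}$
and $R$ with $\mathrm{arity}(R) = \mathrm{arity}(Q)$ and $\mathrm{arity}(\tilde{Q}) = \mathrm{arity}(Q) + 1$. Construct $\Pi''$
by adding to $\Pi'$ the rules


$$\tilde{Q}\bar{x}\bar{\mu}z \leftarrow Q\bar{x}\bar{\mu}$$
$$R\bar{x}\bar{\mu} \leftarrow C^* \wedge (\#_z[\tilde{Q}\bar{x}\bar{\mu}z] = 0).$$


The query $(\Pi'', R)$ is the complement of $(\Pi, Q)$.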


Difficulties in expressing negation are the reason why, in the absence of 
counting (or of an ordering), Datalog is weaker than fixed-point logic. Also, 
the limited form of negation that is available in Stratified Datalog (which 
does not allow for ‘recursion through negation’) does not suffice to express 
all fixed-point queries. (Datalog + C) does not have these limitations, and is 
equally expressive as (IFP + C). 


Theorem 3.4.13 (Grädel and Otto). (Datalog + C) = (IFP + C).

It is obvious that (Datalog + C) $\subseteq$ (IFP + C). For the converse, we can
construct by induction, for every formula $\psi \in$ (IFP + C), a (Datalog + C)
program $\Pi_\psi$ with goal predicate $Q_\psi$ such that $(\Pi_\psi, Q_\psi)$ is equivalent to $\psi$.


Exercise 3.4.14. For atomic formulae, disjunctions, and existential quantification
the construction is obvious, and closure under negation has already
been proved. Complete the proof for applications of counting terms,
i.e. formulae $\psi(\bar{y}, \bar{\mu}, \nu) := \#_x[\varphi(x, \bar{y}, \bar{\mu})] = \nu$, and fixed-point formulae
$\psi := [\mathbf{ifp}\ R\bar{x}\bar{\mu} \,.\, \varphi(R, \bar{x}, \bar{\mu})](\bar{x}, \bar{\mu})$. The construction makes use of Lemma 3.4.11.


Example 3.4.15. To illustrate the expressive power of (Datalog + C) we show 
below a program for the GAME query (for strictly alternating games). The 
GAME query is the canonical example that separates LFP from Stratified 
Datalog [31, 79]. GAME is definable in fixed-point logic, by the formula
$[\mathbf{lfp}\ Wx \,.\, \exists y(Exy \wedge \forall z(Eyz \to Wz))](x)$ that defines the winning positions for
Player 0.

Here is a (Datalog + C) program with goal predicate $Z$, defining GAME:


$$Wx\lambda \leftarrow Exy \wedge Vy\mu \wedge \lambda = \mu + 1$$
$$Fyz\mu \leftarrow Eyz \wedge Wz\mu$$
$$Vy\mu \leftarrow \#_z[Eyz] = \#_z[Fyz\mu]$$
$$Zx \leftarrow Wx\mu$$


The evaluation of this program on a game graph $G$ assigns to $W$ (or $V$) a
set of pairs $(x,\mu) \in V \times \mathbb{N}$, such that Player 0 has a winning strategy from
position $x$ in at most $\mu$ moves when she (or Player 1, respectively) begins the
game.
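To see the inflationary semantics at work, one can evaluate the four clauses stage by stage on a small game graph and compare the result with the least fixed point of the LFP-formula. The following Python sketch is only an illustration. The function names `game_datalog_c` and `game_lfp` are our own, the second sort is taken to be $\{0,\dots,|V|\}$ as in Definition 3.4.1, and at each stage all clause bodies are evaluated simultaneously on the relations of the previous stage.

```python
def game_datalog_c(nodes, edges):
    """Inflationary evaluation of the (Datalog + C) clauses for W, F, V and Z."""
    nodes = list(nodes)
    E = set(edges)
    numbers = range(len(nodes) + 1)              # second sort {0, ..., |V|}
    out_degree = {y: sum(1 for z in nodes if (y, z) in E) for y in nodes}
    W, F, V, Z = set(), set(), set(), set()
    while True:
        # W x lambda <- E x y, V y mu, lambda = mu + 1   (lambda must be in the second sort)
        new_W = {(x, mu + 1) for (x, y) in E for (y2, mu) in V
                 if y == y2 and mu + 1 in numbers}
        # F y z mu <- E y z, W z mu
        new_F = {(y, z, mu) for (y, z) in E for (z2, mu) in W if z == z2}
        # V y mu <- #_z[E y z] = #_z[F y z mu]
        new_V = {(y, mu) for y in nodes for mu in numbers
                 if out_degree[y] == sum(1 for z in nodes if (y, z, mu) in F)}
        # Z x <- W x mu
        new_Z = {x for (x, mu) in W}
        stage = (W | new_W, F | new_F, V | new_V, Z | new_Z)
        if stage == (W, F, V, Z):                # nothing new: fixed point reached
            return Z
        W, F, V, Z = stage


def game_lfp(nodes, edges):
    """Least fixed point of  W x <-> exists y (E x y and forall z (E y z -> W z))."""
    nodes = list(nodes)
    E = set(edges)
    succ = {x: [z for z in nodes if (x, z) in E] for x in nodes}
    W = set()
    while True:
        new = {x for x in nodes
               if any(all(z in W for z in succ[y]) for y in succ[x])}
        if new == W:
            return W
        W = new
```

On the graph with edges $0 \to 1$, $2 \to 0$, $3 \to 2$, $3 \to 1$, both functions return the set $\{0, 3\}$.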


3.5 Capturing PTIME via Canonization 


We have seen that there are a number of logics that capture polynomial time 
on ordered finite structures, but none of them suffices to express all of PTIME 
in the absence of a linear order. Indeed, it has been conjectured that no logic 
whatsoever can capture PTIME on the domain of all finite structures. We 
shall discuss this problem further at the end of this section. But, of course, 
even if this conjecture should turn out to be true, it remains an important 
issue to capture PTIME on other relevant domains besides ordered structures. 


3.5.1 Definable Linear Orders 


An obvious approach is to try to define linear orders and then apply the 
known results for capturing complexity classes on ordered structures.


Definition 3.5.1. Let $D$ be a domain of finite structures and let $L$ be a logic.
We say that $D$ admits $L$-definable linear orders if, for every vocabulary
$\tau$, there exists a formula $\psi(x,y,\bar{z}) \in L(\tau)$ such that there exists in every
structure $\mathfrak{A} \in D(\tau)$ a tuple $\bar{c}$ for which the relation $\{(a, b) : \mathfrak{A} \models \psi(a, b, \bar{c})\}$ is
a linear order on $A$. The elements in $\bar{c}$ are called the parameters of the order
defined by $\psi$ on $\mathfrak{A}$.


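Example 3.5.2. Let $D$ consist of all structures $(A, E, R_1, \dots)$ such that $(A, E)$
is an undirected cycle. $D$ admits LFP-definable linear orders (with two parameters),
via the formula


$$\begin{aligned}
\psi(x, y, z_1, z_2) \equiv Ez_1z_2 \wedge [\mathbf{lfp}\ Rxy \,.\, & (x = z_1 \wedge y = z_2) \vee \exists u(Rxu \wedge Euy \wedge y \neq z_1) \\
& \vee \exists u(Ruy \wedge Eux \wedge x \neq y)](x,y).
\end{aligned}$$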


Furthermore, straightforward automorphism arguments show that we cannot 
define linear orders with fewer than two parameters. 
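The role of the two parameters can be made concrete: once an edge $(z_1, z_2)$ is fixed, it determines a starting point and a direction, and walking around the cycle enumerates the universe. The following Python sketch computes this enumeration directly. It is not an evaluation of the LFP-formula, the function name `cycle_order` is our own, and the cycle is assumed to have at least three nodes.

```python
def cycle_order(nodes, edges, z1, z2):
    """List the nodes of an undirected cycle, starting at z1 and moving towards z2."""
    nodes = list(nodes)
    adj = {v: set() for v in nodes}
    for x, y in edges:                    # undirected: record both directions
        adj[x].add(y)
        adj[y].add(x)
    if z2 not in adj[z1]:
        raise ValueError("the parameters must be adjacent (E z1 z2)")
    order = [z1, z2]
    while len(order) < len(nodes):
        prev, cur = order[-2], order[-1]
        (nxt,) = adj[cur] - {prev}        # the unique neighbour we did not come from
        order.append(nxt)
    return order
```

Different choices of the parameters give different orders of the same cycle, which is why the order is only definable with parameters.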


Exercise 3.5.3. Let $D$ be the domain of structures $(A, E, R_1, \dots)$ such that
$(A, E)$ is isomorphic to a finite rectangular grid. Show that $D$ admits LFP-definable
linear orders.


Exercise 3.5.4. Let $K$ be a class of $\tau$-structures with the following property.
For every $m \in \mathbb{N}$, there exists a structure $\mathfrak{A} \in K$ such that for every $m$-tuple $\bar{a}$
in $\mathfrak{A}$ there exists a non-trivial automorphism of $(\mathfrak{A}, \bar{a})$. Then $K$ does not admit
definable orders in any logic.


On any domain that admits LFP-definable linear orders, we can capture 
PTIME by using LFP-formulae that express polynomial-time properties on 
ordered structures, and modify them appropriately. 


Proposition 3.5.5. If D admits LFP-definable linear orders, then LFP cap- 
tures polynomial time on D. 


Proof. It only remains to show that every polynomial-time model class $K \subseteq D(\tau)$
is LFP-definable. Let $\psi(x, y, \bar{z})$ be a formula defining a linear order on the
structures in $D(\tau)$. As LFP captures PTIME on ordered structures, there
exists a formula $\varphi \in$ LFP$(\tau \cup \{<\})$ such that, for every structure $\mathfrak{A} \in D(\tau)$
and every linear order $<$ on $A$, we have that $(\mathfrak{A}, <) \models \varphi$ iff $\mathfrak{A} \in K$. It follows
that


$$\mathfrak{A} \in K \iff \mathfrak{A} \models \exists\bar{z}\bigl(\text{“}\{(x,y) : \psi(x,y,\bar{z})\} \text{ is a linear order”} \wedge \varphi[u < v / \psi(u, v, \bar{z})]\bigr),$$


where $\varphi[u < v/\psi(u, v, \bar{z})]$ is the formula obtained from $\varphi$ by replacing every
atom of the form $u < v$ by $\psi(u, v, \bar{z})$.


3.5.2 Canonizations and Interpretations


Let $S$ be any set and let $\sim$ be an equivalence relation on $S$. A canonization
function for $(S,\sim)$ is a function $f : S \to S$ associating with every element a
canonical member of its equivalence class. That means that $f(s) \sim s$ for all
$s \in S$, and $f(s) = f(s')$ whenever $s \sim s'$.

In finite model theory, we are interested in canonization algorithms for 
finite structures, either up to isomorphism or up to a coarser equivalence rela- 
tion, such as indistinguishability in some logic or bisimulation. As algorithms 
take encodings of structures as inputs, and as the encoding of a structure is 
determined by an ordering of its universe, we can view canonization of structures
as an operation that associates with every structure $\mathfrak{A}$ an ordered one,
say $(\mathfrak{A}', <)$, such that $\mathfrak{A}'$ is equivalent to $\mathfrak{A}$, and such that equivalent structures
are mapped to the same ordered structure (and hence the same encoding).

For a class $K$ of structures, we write $K^<$ for the class of expansions $(\mathfrak{A}, <)$
of structures $\mathfrak{A} \in K$ by some linear order.


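Definition 3.5.6. Let $K$ be a class of finite $\tau$-structures, and let $\sim$ be an
equivalence relation on $K$. A canonization function for $\sim$ on $K$ is a function
$f : K \to K^<$ that associates with every structure $\mathfrak{A} \in K$ an ordered structure
$f(\mathfrak{A}) = (\mathfrak{A}', <)$ with $\mathfrak{A}' \sim \mathfrak{A}$, such that $f(\mathfrak{A}) \cong f(\mathfrak{B})$ whenever $\mathfrak{A} \sim \mathfrak{B}$.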


Interpretations 


We are especially interested in canonizations that are defined by interpreta- 
tions. The notion of an interpretation is very important in mathematical logic, 
and for model theory in particular. Interpretations are used to define a copy 
of a structure inside another one, and thus permit us to transfer definability, 
decidability, and complexity results between theories. 


Definition 3.5.7. Let $L$ be a logic, let $\sigma, \tau$ be vocabularies, where $\tau =
\{R_1,\dots,R_m\}$ is relational, and let $r_i$ be the arity of $R_i$. A (one-dimensional)
$L[\sigma, \tau]$-interpretation is given by a sequence $I$ of formulae in $L(\sigma)$ consisting
of


• $\delta(x)$, called the domain formula,
• $\varepsilon(x,y)$, called the equality formula, and,
• for every relation symbol $R \in \tau$ (of arity $r$), a formula $\varphi_R(x_1,\dots,x_r)$.


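An $L[\sigma,\tau]$-interpretation induces two mappings, one between structures,
and the other between formulae. For a $\tau$-structure $\mathfrak{A}$ and a $\sigma$-structure $\mathfrak{B}$, we
say that $I$ interprets $\mathfrak{A}$ in $\mathfrak{B}$ (in short, $I(\mathfrak{B}) = \mathfrak{A}$) if there exists a surjective
map $h : \delta^{\mathfrak{B}} \to A$, called the coordinate map, such that


• for all $b, c \in \delta^{\mathfrak{B}}$,
  $$\mathfrak{B} \models \varepsilon(b,c) \iff h(b) = h(c);$$
• for every relation $R$ of $\mathfrak{A}$ and all $b_1,\dots,b_r \in \delta^{\mathfrak{B}}$,


  $$\mathfrak{B} \models \varphi_R(b_1,\dots,b_r) \iff (h(b_1),\dots,h(b_r)) \in R,$$


  i.e. $h^{-1}(R) = (\delta^{\mathfrak{B}})^r \cap \varphi_R^{\mathfrak{B}}$.


Hence $I = (\delta, \varepsilon, \varphi_{R_1}, \dots, \varphi_{R_m})$ defines (together with the function $h :
\delta^{\mathfrak{B}} \to A$) an interpretation of $\mathfrak{A} = (A, R_1,\dots,R_m)$ in $\mathfrak{B}$ if and only if $\varepsilon(x,y)$
defines a congruence on the structure $(\delta^{\mathfrak{B}}, \varphi_{R_1}^{\mathfrak{B}},\dots,\varphi_{R_m}^{\mathfrak{B}})$ and $h$ is an isomorphism
from the quotient structure $(\delta^{\mathfrak{B}}, \varphi_{R_1}^{\mathfrak{B}},\dots,\varphi_{R_m}^{\mathfrak{B}})/\varepsilon^{\mathfrak{B}}$ to $\mathfrak{A}$.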

Besides the mapping $\mathfrak{B} \mapsto I(\mathfrak{B})$ from $\sigma$-structures to $\tau$-structures, $I$ also
defines a mapping from $\tau$-formulae to $\sigma$-formulae. With every $\tau$-formula $\psi$ it
associates a $\sigma$-formula $\psi^I$, which is obtained by relativizing every quantifier
$Qx$ to $\delta(x)$, replacing equalities $u = v$ by $\varepsilon(u,v)$, and replacing every atom
$R\bar{u}$ by the corresponding formula $\varphi_R(\bar{u})$.


Lemma 3.5.8 (Interpretation Lemma). For every interpretation $I$ and
every structure $\mathfrak{A}$, we have that


$$\mathfrak{A} \models \psi^I \iff I(\mathfrak{A}) \models \psi.$$
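The mapping between structures can be computed directly when the formulae of an interpretation are given as Boolean predicates. The following Python sketch is an illustration under our own assumptions: the function name `interpret` is hypothetical, the equality predicate is assumed to be a congruence (as required above), and elements of the interpreted structure are named by the index of their equivalence class. As an example it recovers a directed path with three nodes from a six-node graph in which every node of the path has been duplicated, identifying two nodes when they have the same in- and out-neighbours.

```python
from itertools import product


def interpret(universe, delta, eps, relations):
    """Compute I(B) from a domain predicate, an equality predicate and relation predicates.

    relations maps a relation name to a pair (arity, predicate).
    Returns the number of elements of I(B) and the interpreted relations.
    """
    domain = [b for b in universe if delta(b)]
    classes = []                                  # equivalence classes of eps on the domain
    for b in domain:
        for cls in classes:
            if eps(b, cls[0]):
                cls.append(b)
                break
        else:
            classes.append([b])
    h = {b: i for i, cls in enumerate(classes) for b in cls}   # coordinate map
    interpreted = {
        name: {tuple(h[b] for b in bs)
               for bs in product(domain, repeat=arity) if phi(*bs)}
        for name, (arity, phi) in relations.items()
    }
    return len(classes), interpreted


# Example: every node m of the path 0 -> 1 -> 2 is duplicated as (m, 0) and (m, 1).
V = [(m, i) for m in range(3) for i in (0, 1)]
E = {((m, i), (m + 1, j)) for m in range(2) for i in (0, 1) for j in (0, 1)}


def same_neighbours(b, c):
    """Assumed equality formula: b and c have the same in- and out-neighbours."""
    return all(((b, d) in E) == ((c, d) in E) and ((d, b) in E) == ((d, c) in E)
               for d in V)


size, rels = interpret(V, lambda b: True, same_neighbours,
                       {"E": (2, lambda b, c: (b, c) in E)})
```

Here `size` is 3 and `rels["E"]` is the edge relation of the path on the three classes.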


We shall omit $\delta$ or $\varepsilon$ from an interpretation if they are trivial, in
the sense that $\delta(x)$ holds for all $x$ and that $\varepsilon(x,y)$ is equivalent to
$x = y$. The notion of an interpretation can be generalized in various
ways. In particular, a $k$-dimensional interpretation is given by a sequence
$\delta(\bar{x}), \varepsilon(\bar{x},\bar{y}), \varphi_{R_1}(\bar{x}_1,\dots,\bar{x}_{r_1}), \dots, \varphi_{R_m}(\bar{x}_1,\dots,\bar{x}_{r_m})$, where $\bar{x}, \bar{y}, \bar{x}_1, \dots$
are disjoint $k$-tuples of distinct variables. A $k$-dimensional interpretation of $\mathfrak{A}$
in $\mathfrak{B}$ represents elements of $A$ by elements or equivalence classes of $B^k$, rather
than $B$.


Exercise 3.5.9. Show that up to first-order interpretation, all finite structures
are graphs (see e.g. [66, Chapter 5] and [38, Chapter 11.2]). More precisely,
for every vocabulary $\tau$, construct an FO$[\{E\}, \tau]$-interpretation $I$ and
an FO$[\tau, \{E\}]$-interpretation $J$ such that, for every finite structure $\mathfrak{A}$ (with at
least two elements), $I(\mathfrak{A})$ is a graph and $J(I(\mathfrak{A})) = \mathfrak{A}$. It then follows that for
every model class $K \subseteq \mathrm{Fin}(\tau)$, $K$ is decidable in polynomial time if, and only
if, the class of graphs $\{I(\mathfrak{A}) : \mathfrak{A} \in K\}$ is so.


Definition 3.5.10. Let $L$ be a logic and $\sim$ an equivalence relation on a class
$K$ of $\tau$-structures. We say that $(K, \sim)$ admits $L$-definable canonization if
there exists an $L[\tau, \tau \cup \{<\}]$-interpretation $I$ such that the function $\mathfrak{A} \mapsto I(\mathfrak{A})$
is a canonization function for $\sim$. For any domain $D$ of structures, we say that
$(D,\sim)$ admits $L$-definable canonization if $(D(\tau), \sim)$ does for every vocabulary
$\tau$. Finally, we say that $D$ admits $L$-definable canonization if $(D, \cong)$ does.


Example 3.5.11. (Definable canonization versus definability of order.)
Whenever $D$ admits $L$-definable linear orders, and $L$ is closed under first-order
operations, $D$ also admits $L$-definable canonization. This is obvious if the
formula $\psi_<$ defining the order has no parameters. If it uses parameters, then
it may define, for each structure $\mathfrak{A}$, a family of ordered expansions $(\mathfrak{A}, <)$. But
these expansions can be compared by use of the lexicographic order of their
encodings. As $L$ is closed under first-order operations, the minimal expansion
with respect to this lexicographical order is $L$-definable, which gives an
$L$-definable canonization.

Note, however, that there exist definable canonizations even in cases where
no order is definable. Consider for instance the class of finite directed paths $P_n$
(for $n \in \mathbb{N}$), and take their ‘double graphs’ (see Section 3.2.5), i.e. the graphs
$2P_n = (V, E)$, where $V = \{0,\dots,n-1\} \times \{0,1\}$ and $E = \{((m,i), (m+1,j)) :
0 \le m < n-1,\ i,j \in \{0,1\}\}$. On this class, no order is definable in any logic
and with any finite number of parameters (to see this use Exercise 3.5.4).
However, the class admits DTC-definable canonization.

We shall explain the construction, which is uniform for all $n$, informally.
The obvious equivalence relation on $2P_n$, where $(m,i) \sim (m', j)$ iff $m = m'$,
is first-order definable, and so $P_n$ is interpretable in $2P_n$. Further, the nodes
$0$ and $n - 1$ are definable in $P_n$, and so $(C_n, 0)$, the directed $n$-cycle with
a distinguished point, is interpretable in $2P_n$ as well. It therefore suffices to
show that an ordered copy of $2P_n$ is interpretable in $(C_n, 0)$. We represent
nodes of $2P_n$ by edges and inverse edges of $C_n$: the node $(m,0)$ is represented
in $C_n$ by the pair $(m,m+1)$ and the node $(m, 1)$ by the pair $(m+1,m)$. The
order on these pairs is


$$(0,1) < (1,0) < (1,2) < (2,1) < \cdots < (n-2,n-1) < (n-1,n-2).$$


The domain formula for the interpretation (of $2P_n$ in $C_n$) is $\delta(x, y) :=
Exy \vee Eyx$. It is not difficult to see that the edge relation and the linear order
are definable using DTC operators. The details are left to the reader.


A simple but interesting example of definable canonization is tree canon- 
ization via fixed-point logic with counting. 


Proposition 3.5.12. The class of (directed) trees admits (IFP + C)-definable 
canonization. 


Proof. The interpretation $I$ that we construct maps a tree $T = (V, E)$ (with
$n$ nodes) to an ordered tree $I(T) = (\{1,\dots,n\}, E', <)$, where $<$ is the natural
order. That is, the interpretation is one-dimensional, maps nodes to numbers,
and is defined by the formulae $\delta(\mu) := \exists\nu(\nu < \mu)$, $\varphi_<(\mu,\nu) := \mu < \nu$, and a
formula $\varphi_{E'}(\mu,\nu)$ that we do not explicitly construct.

The construction of $E'$ is based on an inductively defined ternary relation
$F \subseteq V \times \{1,\dots,n\}^2$ that encodes the sequence of binary relations $F_v :=
\{(i, j) : (v,i, j) \in F\}$. For each node $v$ of $T$, let $T_v$ denote the subtree of $T$
with root $v$, and let $S_v$ be the graph $(\{1,\dots,|T_v|\}, F_v)$. The construction will
ensure that $S_v$ is isomorphic to $T_v$.

If $v$ is a leaf, let $F_v = \emptyset$. Suppose now that $v$ has children $v_1,\dots,v_m$, and
that the graphs $S_{v_1},\dots,S_{v_m}$ have already been constructed. To define $S_v$ we
compute the code words $w_i = \mathrm{code}(S_{v_i},<)$ (where $<$ is the natural order)
and arrange them in lexicographic order. Now let $S_v$ be the graph with nodes
$1,\dots,|T_v|$, obtained by first taking a copy of the $S_{v_i}$ with the smallest code
word, then taking a copy of the second, and so on, and finally adding another
node that is connected to the roots of the copies of the $S_{v_i}$. Obviously, $S_v$
determines $F_v$, and $S_v \cong T_v$.

It is clear that the inductive construction of $F$ can be done via an (IFP +
C)-formula $\psi_F(x, \mu, \nu)$. Now take $\varphi_{E'}(\mu,\nu) := \exists x\, \psi_F(x, \mu, \nu)$.
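The inductive construction of the graphs $S_v$ can be followed step by step in a short program. The following Python sketch is an illustration, not the (IFP + C)-formula itself. The function name `canonize_tree` is our own, trees are assumed to be given by a dictionary mapping each node to the list of its children, and as code word of $S_v$ we simply use the pair consisting of its size and its sorted edge list.

```python
def canonize_tree(children, root):
    """Return (n, edges) for a copy of the tree on the nodes 1, ..., n.

    The copy of each subtree T_v occupies an interval of numbers, and its root
    is the last (largest) number of that interval, as in the construction of S_v.
    """
    def build(v):
        # Canonical copies of the subtrees of the children, in lexicographic order
        # of their code words (size, sorted edge list).
        parts = sorted(build(c) for c in children.get(v, []))
        edges, offset, child_roots = [], 0, []
        for size, sub_edges in parts:
            edges.extend((i + offset, j + offset) for i, j in sub_edges)
            offset += size
            child_roots.append(offset)           # root of a copy is its last node
        root_number = offset + 1                 # the additional node for v itself
        edges.extend((root_number, r) for r in child_roots)
        return root_number, tuple(sorted(edges))

    return build(root)
```

Two isomorphic trees, for instance `{'r': ['a', 'b'], 'a': ['c']}` with root `'r'` and `{1: [2, 3], 3: [4]}` with root `1`, are mapped to the same pair, because the result no longer depends on the names of the nodes or the order in which children are listed.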


Theorem 3.5.13. Let $D$ be a domain of (finite) structures, and let $L$ be a
logic that captures PTIME on $D^<$. If $D$ admits $L$-definable canonization, then
$L$ captures PTIME on $D$ also.


Proof. Let $K \subseteq D(\tau)$ be a model class that is decidable in polynomial time,
and let $\psi \in L(\tau \cup \{<\})$ be a formula defining $K^<$ inside $D^<(\tau)$. Further, let
$I$ be an $L[\tau, \tau \cup \{<\}]$-interpretation that defines a canonization on $D(\tau)$. By
the Interpretation Lemma,


$$\mathfrak{A} \models \psi^I \iff I(\mathfrak{A}) \models \psi \iff I(\mathfrak{A}) \in K^< \iff \mathfrak{A} \in K.$$


Hence $L$ captures PTIME on $D$.


This result is important because it has been shown, in particular in the 
work of Grohe [58-60], that a number of interesting domains admit canoniza- 
tion via fixed-point logic with counting (IFP + C). Among these are 


(1) the domain of finite (labelled) trees (see Proposition 3.5.12); 

(2) the class of planar graphs [58] and, more generally, any domain of structures
whose Gaifman graphs are embeddable in a fixed surface [59];

(3) any domain of structures of bounded tree width [60]. 


Corollary 3.5.14. (IFP + C) captures PTIME on any of these domains. 


Further, the results extend to domains that can be reduced to any of the 
domains mentioned above by simple definable operations such as adding or 
deleting a vertex or edge. An example is that of nearly planar (or apex) graphs, 
which become planar when one vertex is removed. 


3.5.3 Capturing PTIME up to Bisimulation 


In mathematics, we consider isomorphic structures as identical. Indeed, it 
almost goes without saying that relevant mathematical notions do not distin- 
guish between isomorphic objects. As classical algorithmic devices work on 
ordered representations of structures rather than the structures themselves, 
our capturing results rely on an ability to reason about canonical ordered 
representations of isomorphism classes of finite structures.

However, in many application domains of logic, structures are distinguished
only up to equivalences coarser than isomorphism. Perhaps the best-known
example is the modelling of the computational behaviour of (concurrent)
programs by transition systems. The meaning of a program is usually
not captured by a unique transition system. Rather, transition systems are
distinguished only up to appropriate notions of behavioural equivalence, the
most important of these being bisimulation.

In such a context, the idea of a logic capturing PTIME gets a new twist. 
One would like to express in a logic precisely those properties of structures 
that are 


(1) decidable in polynomial time, and 
(2) invariant under the notion of equivalence being studied. 


Let us look at one specific problem in this context, the problem of 
bisimulation-invariant properties of transition systems. 


Definition 3.5.15. Let $G = (V, (E_a)_{a\in A}, (P_b)_{b\in B})$ and $G' =
(V', (E'_a)_{a\in A}, (P'_b)_{b\in B})$ be two transition systems of the same vocabulary.
A bisimulation between $G$ and $G'$ is a non-empty relation $Z \subseteq V \times V'$,
respecting the $P_b$ in the sense that $v \in P_b$ iff $v' \in P'_b$, for all $b \in B$ and
$(v, v') \in Z$, and satisfying the following back and forth conditions.


Forth. For all $(v, v') \in Z$, $a \in A$ and every $w$ such that $(v, w) \in E_a$, there
exists a $w'$ such that $(v',w') \in E'_a$ and $(w, w') \in Z$.

Back. For all $(v, v') \in Z$, $a \in A$ and every $w'$ such that $(v',w') \in E'_a$, there
exists a $w$ such that $(v, w) \in E_a$ and $(w, w') \in Z$.


A rooted transition system is a pair $(G, u)$, where $G$ is a transition
system and $u$ is a node of $G$. Two rooted transition systems $(G, u)$ and
$(G',u')$ are bisimilar, denoted by $G, u \sim G',u'$, if there is a bisimulation $Z$
between $G$ and $G'$ with $(u, u') \in Z$.


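Exercise 3.5.16. Bisimulation is a greatest fixed point. Prove that two nodes
$u, u'$ of a transition system $G$ are bisimilar, i.e. $(G, u) \sim (G, u')$ if, and only
if,
$$\begin{aligned}
G \models \Bigl[\mathbf{gfp}\ Rxy \,.\, & \bigwedge_{b\in B} (P_bx \leftrightarrow P_by)\ \wedge \\
& \bigwedge_{a\in A} (\forall x' . E_axx')(\exists y' . E_ayy')Rx'y'\ \wedge \\
& \bigwedge_{a\in A} (\forall y' . E_ayy')(\exists x' . E_axx')Rx'y'\Bigr](u, u').
\end{aligned}$$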
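The greatest fixed point can be computed by starting with the set of all pairs of nodes and repeatedly removing pairs that violate the atomic, forth or back condition. The following Python sketch follows this iteration literally. It is an illustration under our own assumptions: the function name `bisimilarity` is hypothetical, transition relations are given as a dictionary from actions to sets of pairs, and atomic propositions as a dictionary from names to sets of nodes.

```python
def bisimilarity(nodes, edges, props):
    """Greatest fixed point of the bisimulation operator on one transition system.

    edges: dict mapping an action a to the set E_a of pairs (v, w).
    props: dict mapping a proposition b to the set P_b of nodes.
    Returns the set of all pairs (x, y) of bisimilar nodes.
    """
    nodes = list(nodes)
    succ = {a: {v: {w for (x, w) in rel if x == v} for v in nodes}
            for a, rel in edges.items()}
    R = {(x, y) for x in nodes for y in nodes}        # start with all pairs

    def ok(x, y):
        # Atomic condition: x and y satisfy the same propositions.
        if any((x in P) != (y in P) for P in props.values()):
            return False
        for a in succ:
            forth = all(any((x1, y1) in R for y1 in succ[a][y]) for x1 in succ[a][x])
            back = all(any((x1, y1) in R for x1 in succ[a][x]) for y1 in succ[a][y])
            if not (forth and back):
                return False
        return True

    while True:
        refined = {(x, y) for (x, y) in R if ok(x, y)}
        if refined == R:
            return R
        R = refined
```

Since the relation can only shrink, the loop terminates after at most $|V|^2$ rounds, which also shows that bisimilarity on finite transition systems is decidable in polynomial time.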


A class $S$ of rooted transition systems is invariant under bisimulation
if, whenever $(G, u) \in S$ and $(G, u) \sim (G',u')$, then also $(G',u') \in S$. We say
that a class $S$ of finite rooted transition systems is in bisimulation-invariant
PTIME if it is invariant under bisimulation, and if there exists a polynomial-time
algorithm deciding whether a given pair $(G, u)$ belongs to $S$. A logic $L$ is
invariant under bisimulation if all $L$-definable properties of rooted transition
systems are.


Exercise 3.5.17. Prove that ML, the modal $\mu$-calculus $L_\mu$, and the infinitary
modal logic $\mathrm{ML}^\infty$ are invariant under bisimulation.


Clearly, $L_\mu \subseteq$ bisimulation-invariant PTIME. However, as pointed out in
Section 3.3.3, $L_\mu$ is far too weak to capture this class, mainly because it is
essentially a monadic logic. Instead, we have to consider a multidimensional
variant $L^\omega_\mu$ of $L_\mu$.

But before we define this logic, we should explain the main technical step, 
which relies on definable canonization, but of course with respect to bisimu- 
lation rather than isomorphism. For simplicity of notation, we consider only 
transition systems with a single transition relation $E$. The extension to the
case of several transition relations $E_a$ is completely straightforward.

With a rooted transition system $G = (V, E, (P_b)_{b\in B}), u$, we associate a
new transition system


$$G^\sim_u := (V^\sim_u, E^\sim, (P^\sim_b)_{b\in B}),$$


where $V^\sim_u$ is the set of all $\sim$-equivalence classes $[v]$ of nodes $v \in V$ that are
reachable from $u$. More formally, let $[v]$ denote the bisimulation equivalence
class of a node $v \in V$. Then


$$\begin{aligned}
V^\sim_u &:= \{[v] : \text{there is a path in } G \text{ from } u \text{ to } v\} \\
P^\sim_b &:= \{[v] \in V^\sim_u : v \in P_b\} \\
E^\sim &:= \{([v], [w]) : (v, w) \in E\}.
\end{aligned}$$


As shown in the following exercise, the pair $G^\sim_u, [u]$ is, up to isomorphism,
a canonical representant of the bisimulation equivalence class of $G, u$.


Exercise 3.5.18. Prove that (1) $(G, u) \sim (G^\sim_u, [u])$, and (2) if $(G, u) \sim (H, v)$,
then $(G^\sim_u, [u]) \cong (H^\sim_v, [v])$.


It follows that a class $S$ of rooted transition systems is bisimulation-invariant
if and only if $S = \{(G, u) : (G^\sim_u, [u]) \in S\}$. Let $\mathrm{CR}^\sim$ be the domain
of canonical representants of finite transition systems, i.e.


$$\mathrm{CR}^\sim := \{(G,u) : (G^\sim_u, [u]) \cong (G, u)\}.$$


Proposition 3.5.19. $\mathrm{CR}^\sim$ admits LFP-definable linear orderings.


Proof. We show that for every vocabulary $\tau = \{E\} \cup \{P_b : b \in B\}$, there exists
a formula $\psi(x,y) \in$ LFP$(\tau)$ which defines a linear order on every transition
system in $\mathrm{CR}^\sim(\tau)$.

Recall that bisimulation equivalence on a transition system is a greatest
fixed point. Its complement, bisimulation inequivalence, is a least fixed point,
which is the limit of an increasing sequence $\not\sim_i$ defined as follows: $u \not\sim_0 v$ if $u$
and $v$ do not have the same atomic type, i.e. if there exists some $b$ such that
one of the nodes $u,v$ has the property $P_b$ and the other does not. Further,
$u \not\sim_{i+1} v$ if the sets of $\sim_i$-classes that are reachable in one step from $u$ and $v$
are different. The idea is to refine this inductive process, by defining relations
$\prec_i$ that order the $\sim_i$-classes. On the transition system itself, these relations
are pre-orders. The inductive limit $\prec$ of the pre-orders $\prec_i$ defines a linear
order of the bisimulation equivalence classes. But in transition systems in
$\mathrm{CR}^\sim$, bisimulation classes have only one element, so $\prec$ actually defines a
linear order on the set of nodes.

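To make this precise, we choose an order on B and define $\prec_0$ by
enumerating the $2^{|B|}$ atomic types with respect to the propositions
$P_b$, i.e.


$$x \prec_0 y := \bigvee_{b \in B} \Bigl( \neg P_b x \wedge P_b y \wedge
  \bigwedge_{b' < b} (P_{b'} x \leftrightarrow P_{b'} y) \Bigr).$$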

In what follows, $x \sim_i y$ can be taken as an abbreviation for
$\neg(x \prec_i y \vee y \prec_i x)$, and similarly for $x \sim y$. We define
$x \prec_{i+1} y$ by the condition that either $x \prec_i y$, or
$x \sim_i y$ and the set of $\sim_i$-classes reachable from x is
lexicographically smaller than the set of $\sim_i$-classes reachable from y.
Note that this inductive definition of $\prec$ is not monotone, so it cannot
be directly captured by an LFP-formula. However, as we know that LFP = IFP,
we can use an IFP-formula instead. Explicitly, $\prec$ is defined by
$[\mathrm{ifp}\, x \prec y \,.\, \psi(\prec, x, y)](x, y)$, where


$$\psi(\prec, x, y) := x \prec_0 y \vee \Bigl( x \sim y \wedge
  (\exists y' . Eyy') \bigl( (\forall x' . Exx')\, x' \not\sim y' \wedge
  (\forall z . z \prec y') \bigl( \exists x'' (Exx'' \wedge x'' \sim z)
  \leftrightarrow \exists y'' (Eyy'' \wedge y'' \sim z) \bigr) \bigr) \Bigr).$$


Exercise 3.5.20. Complete the proof by showing that the formula
$[\mathrm{ifp}\, x \prec y \,.\, \psi(\prec, x, y)](x, y)$ indeed defines the
order described above.
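The staged construction of the order can also be traced operationally. The
sketch below is an illustration under assumptions of our own: transition
systems are given as a non-empty node list, an edge set, and a dictionary of
proposition extensions, `prop_order` lists B in the chosen order, and the name
`order_classes` is hypothetical. Each stage assigns ranks to the current
classes; the set of classes reachable from a node is encoded as a
characteristic vector, so that tuple comparison coincides with the
lexicographic comparison of sets used in the proof.

```python
def order_classes(nodes, edges, props, prop_order):
    """Return a dict node -> rank; equal rank means bisimilar.

    On a canonical representant all ranks are distinct, so the ranks
    give a linear order of the nodes.
    """
    nodes = list(nodes)
    succ = {v: set() for v in nodes}
    for v, w in edges:
        succ[v].add(w)

    def dense(keys):
        # Replace sort keys by their position among the distinct keys.
        position = {k: r for r, k in enumerate(sorted(set(keys.values())))}
        return {v: position[keys[v]] for v in nodes}

    # Stage 0: compare atomic types along the chosen order on B.
    rank = dense({v: tuple(v in props[b] for b in prop_order) for v in nodes})
    while True:
        m = max(rank.values()) + 1  # number of classes at this stage
        keys = {v: (rank[v],
                    tuple(any(rank[w] == r for w in succ[v]) for r in range(m)))
                for v in nodes}
        refined = dense(keys)
        if max(refined.values()) + 1 == m:
            return rank  # inflationary stage reached: nothing was refined
        rank = refined
```

Because every new key begins with the old rank, each stage only refines the
previous pre-order, which mirrors the inflationary character of the
IFP-definition.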


Corollary 3.5.21. On the domain CR~, LFP captures PTIME. 


In fact, this result already suffices to give an abstract capturing result for 
bisimulation-invariant PTIME (in the sense of the following section): by com- 
posing the mapping from rooted transition systems to their canonical rep- 
resentants with LFP queries on these representants, we obtain an abstract 
logic with recursive syntax and polynomial-time semantics that describes 
precisely the polynomial-time computable, bisimulation-invariant queries on
rooted transition systems. 

In many situations (such as for polynomial time on arbitrary finite
structures), we would actually be quite happy with such an abstract capturing
result. However, in the bisimulation-invariant scenario we can do better and
capture PTIME in terms of a natural logic, the multidimensional μ-calculus
$L_\mu^\omega$.


Definition 3.5.22. The syntax of the k-dimensional μ-calculus $L_\mu^k$ (for
transition systems $G = (V, E, (P_b)_{b \in B})$) is the same as the syntax of
the usual μ-calculus $L_\mu$ with modal operators $\langle i \rangle$, $[i]$
for $i = 1, \dots, k$, and $\langle \sigma \rangle$, $[\sigma]$ for every
substitution $\sigma : \{1, \dots, k\} \to \{1, \dots, k\}$. Let S(k) be the
set of all these substitutions.

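The semantics is different, however. A formula ψ of $L_\mu^k$ is interpreted
on a transition system $G = (V, E, (P_b)_{b \in B})$ at node v by evaluating
it as a formula of $L_\mu$ on the modified transition system


$$G^k = (V^k, (E_i)_{1 \le i \le k}, (E_\sigma)_{\sigma \in S(k)},
  (P_{b,i})_{b \in B,\, 1 \le i \le k})$$

at node $\bar v := (v, v, \dots, v)$. Here $V^k = V \times \cdots \times V$ and

$$E_i := \{(\bar v, \bar w) \in V^k \times V^k : (v_i, w_i) \in E
  \text{ and } v_j = w_j \text{ for } j \ne i\}$$
$$E_\sigma := \{(\bar v, \bar w) \in V^k \times V^k : w_i = v_{\sigma(i)}
  \text{ for all } i\}$$
$$P_{b,i} := \{\bar v \in V^k : v_i \in P_b\}.$$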


That is, $G, v \models_{L_\mu^k} \psi$ iff
$G^k, (v, \dots, v) \models_{L_\mu} \psi$. The multidimensional μ-calculus is
$L_\mu^\omega = \bigcup_{k < \omega} L_\mu^k$.


Remark. Instead of evaluating a formula $\psi \in L_\mu^k$ at single nodes v
of G, we can also evaluate it at k-tuples of nodes:
$G, \bar v \models_{L_\mu^k} \psi$ iff $G^k, \bar v \models_{L_\mu} \psi$.
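To see what the modified transition system looks like, the following sketch
builds $G^k$ explicitly for a small G. It is an illustration only: the
function name `k_dimensional` and the input format (node list, edge set,
dictionary of proposition extensions) are our assumptions, and components are
indexed from 0 rather than from 1.

```python
from itertools import product


def k_dimensional(nodes, edges, props, k):
    """Build the components of G^k; indices i and substitutions are 0-based."""
    nodes = list(nodes)
    tuples = list(product(nodes, repeat=k))

    # E_i: move the i-th component along an edge, keep the others fixed.
    E_i = {i: set() for i in range(k)}
    for t in tuples:
        for i in range(k):
            for v, w in edges:
                if v == t[i]:
                    E_i[i].add((t, t[:i] + (w,) + t[i + 1:]))

    # E_sigma: rearrange components, w_i = v_sigma(i).
    E_sigma = {}
    for sigma in product(range(k), repeat=k):
        E_sigma[sigma] = {(t, tuple(t[sigma[i]] for i in range(k)))
                          for t in tuples}

    # P_{b,i}: the i-th component satisfies P_b.
    P = {(b, i): {t for t in tuples if t[i] in ext}
         for b, ext in props.items() for i in range(k)}
    return tuples, E_i, E_sigma, P
```

Note that $G^k$ has $|V|^k$ nodes and $k^k$ substitution relations, so for
fixed k its size is polynomial in the size of G.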


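Example 3.5.23. Bisimulation is definable in $L_\mu^2$ (in the sense of the
remark just made). Let

$$\psi_\sim := \nu X . \Bigl( \bigwedge_{b \in B} (P_{b,1} \leftrightarrow
  P_{b,2}) \wedge [1]\langle 2 \rangle X \wedge [2]\langle 1 \rangle X \Bigr).$$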

For every transition system G, we have that
$G, v_1, v_2 \models \psi_\sim$ if, and only if, $v_1$ and $v_2$ are bisimilar
in G. Further, we have that


$$G, v \models \mu Y . \langle 2 \rangle (\psi_\sim \vee Y)$$


if, and only if, there exists in G a point w that is reachable from v (by a
path of length ≥ 1) and bisimilar to v.
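Both statements of the example can be checked on small systems by computing
the fixed points directly on pairs of nodes. The sketch below is an
illustration under our own conventions (node list, edge set, dictionary of
proposition extensions; the function names are hypothetical). It computes
bisimilarity as a greatest fixed point on $V \times V$, starting from all
pairs, and then searches along the second component for a reachable bisimilar
node.

```python
def bisimilar_pairs(nodes, edges, props):
    """Greatest fixed point: the set of pairs (v, w) of bisimilar nodes."""
    nodes = list(nodes)
    succ = {v: set() for v in nodes}
    for v, w in edges:
        succ[v].add(w)

    def same_type(v, w):
        return all((v in ext) == (w in ext) for ext in props.values())

    X = {(v, w) for v in nodes for w in nodes}  # start from the top element
    while True:
        new = {(v, w) for (v, w) in X
               if same_type(v, w)
               # every move of v is matched by some move of w ...
               and all(any((v2, w2) in X for w2 in succ[w]) for v2 in succ[v])
               # ... and every move of w is matched by some move of v
               and all(any((v2, w2) in X for v2 in succ[v]) for w2 in succ[w])}
        if new == X:
            return X
        X = new


def reachable_bisimilar_exists(nodes, edges, props, v):
    """Is some node reachable from v by a path of length >= 1 bisimilar to v?"""
    sim = bisimilar_pairs(nodes, edges, props)
    succ = {u: set() for u in nodes}
    for a, b in edges:
        succ[a].add(b)
    seen, frontier = set(succ[v]), list(succ[v])
    while frontier:  # least fixed point: reachability from the successors of v
        for w in succ[frontier.pop()]:
            if w not in seen:
                seen.add(w)
                frontier.append(w)
    return any((v, w) in sim for w in seen)
```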


Exercise 3.5.24. Prove that $L_\mu^\omega$ is invariant under bisimulation.
Further, show that $L_\mu^\omega$ can be embedded in LFP.


This exercise establishes the easy direction of the desired result:
$L_\mu^\omega \subseteq$ bisimulation-invariant PTIME. For the converse, it
suffices to show that LFP and $L_\mu^\omega$ are equivalent on the domain
$\mathrm{CR}^\sim$. Let S be a class of rooted transition systems in
bisimulation-invariant PTIME. For any (G, u), we have that $(G, u) \in S$ if,
and only if, its canonical representant $(G^\sim, [u]) \in S$. If LFP and
$L_\mu^\omega$ are equivalent on $\mathrm{CR}^\sim$, then there exists a
formula $\psi \in L_\mu^\omega$ such that $G^\sim, [u] \models \psi$ iff
$(G^\sim, [u]) \in S$. By the bisimulation invariance of ψ, it follows that
$G, u \models \psi$ iff $(G, u) \in S$.


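Proposition 3.5.25. On the domain $\mathrm{CR}^\sim$,
$\mathrm{LFP} \le L_\mu^\omega$. More precisely, for each formula
$\psi(x_1, \dots, x_{k+1}) \in \mathrm{LFP}$ of width $\le k + 1$, there
exists a formula $\psi^* \in L_\mu^{k+1}$ such that for each
$(G, u) \in \mathrm{CR}^\sim$, we have that $G \models \psi(u, \bar v)$ iff
$G, u, \bar v \models \psi^*$.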


Note that although, ultimately, we are interested only in formulae $\psi(x)$
with just one free variable, we need more general formulae, and evaluation
of $L_\mu^k$-formulae over k-tuples of nodes, for the inductive treatment. In
all formulae, we shall have at least $x_1$ as a free variable, and we always
interpret $x_1$ as u (the root of the transition system). We remark that, by
an obvious modification of the formula given in Example 3.5.23, we can express
in $L_\mu^k$ the assertion that $x_i \sim x_j$ for any i, j.

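Atomic formulae are translated from LFP to $L_\mu^k$ according to


$$(x_i = x_j)^* := x_i \sim x_j$$
$$(P_b x_i)^* := P_{b,i}$$
$$(E x_i x_j)^* := \langle i \rangle\, x_i \sim x_j$$
$$(X x_{\sigma(1)} \cdots x_{\sigma(r)})^* := \langle \sigma \rangle X.$$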


Boolean connectives are treated in the obvious way, and quantifiers are
translated by use of fixed points. To find a witness $x_j$ satisfying a
formula ψ, we start at u (i.e. set $x_j = x_1$), and search along transitions
(i.e. use the μ-expression for reachability). That is, let j/1 be the
substitution that maps j to 1 and fixes the other indices, and translate
$\exists x_j \psi(\bar x)$ into


$$\langle j/1 \rangle\, \mu Y . (\psi^* \vee \langle j \rangle Y).$$


Finally, fixed points are first brought into normal form so that variables
appear in the right order, and then they are translated literally, i.e.
$[\mathrm{lfp}\, X \bar x \,.\, \psi](\bar x)$ translates into
$\mu X . \psi^*$.

The proof that the translation has the desired property is a straightforward 
induction, which we leave as an exercise (see [90] for details). Altogether we 
have established the following result. 


Theorem 3.5.26 (Otto). The multidimensional μ-calculus captures
bisimulation-invariant PTIME.




Otto has also established capturing results with respect to other
equivalences. For finite structures $\mathfrak A, \mathfrak B$, we say that
$\mathfrak A \equiv^k \mathfrak B$ if no first-order sentence of width k can
distinguish between $\mathfrak A$ and $\mathfrak B$. Similarly,
$\mathfrak A \equiv^k_C \mathfrak B$ if $\mathfrak A$ and $\mathfrak B$ are
indistinguishable by first-order sentences of width k with counting
quantifiers of the form $\exists^{\ge i} x$, for any $i \in \mathbb N$.


Theorem 3.5.27 (Otto). There exist logics that effectively capture
$\equiv^2$-invariant PTIME and $\equiv^2_C$-invariant PTIME on the class of
all finite structures.


For details, see [89]. 


3.5.4 Is There a Logic for PTIME? 


To discuss the problem of whether PTIME can be captured on the domain 
of all finite structures, we need to make precise the notion of a logic, and 
to refine the notion of a logic capturing a complexity class, so as to exclude 
pathological examples such as the following, which is due to Gurevich [61].


Example 3.5.28. Let the syntax of our ‘logic’ consist of all pairs (M, k),
where M is a Turing machine, and k a natural number. A finite τ-structure
$\mathfrak A$ is a model of (M, k) if there exists a model class
$K \subseteq \mathrm{Fin}(\tau)$ such that $\mathfrak A \in K$, and M accepts
an encoding $\mathrm{code}(\mathfrak B, <)$ of a finite τ-structure
$\mathfrak B$ in time $|B|^k$ if, and only if, $\mathfrak B \in K$. Note that
this ‘logic’ captures PTIME on finite structures. But the example is
pathological, not mainly because of its unusual format, but because its
semantics is not effective: it is undecidable whether a Turing machine
accepts an isomorphism-closed class of structures.

Another example of this kind is order-invariant LFP. The τ-sentences of
this logic are the LFP-sentences ψ of vocabulary $\tau \cup \{<\}$ such that,
for all finite τ-structures $\mathfrak A$ and all linear orders <, <′ on
$\mathfrak A$, we have that $(\mathfrak A, <) \models \psi$ if and only if
$(\mathfrak A, <') \models \psi$. This defines the syntax. The semantics is
the obvious one: a structure $\mathfrak A$ is a model of ψ if, and only if,
$(\mathfrak A, <) \models \psi$ for some, and hence all, linear orders on
$\mathfrak A$. This ‘logic’ also captures PTIME, but again it has an
undesirable feature: it is undecidable whether a given sentence
$\psi \in \mathrm{LFP}$ is order-invariant (compare Exercise 3.1.12), so the
‘logic’ does not have an effective syntax.
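The invariance condition can at least be tested on a single given structure
by brute force, which may help to fix the idea. The following sketch is an
illustration only: queries are modelled as Python functions that receive a
linear order as a tuple listing a non-empty universe increasingly, and the
names `invariant_on`, `even_cardinality`, and `least_element_in_P` are ours.
It says nothing about invariance on all finite structures, which is the
undecidable property referred to above.

```python
from itertools import permutations


def invariant_on(universe, query):
    """Does query give the same answer under every linear order of universe?"""
    answers = {query(order) for order in permutations(universe)}
    return len(answers) == 1


P = {"a"}  # a unary relation on the universe below


def even_cardinality(order):
    # Walk along the order and flip a parity bit: uses <, but is invariant.
    even = True
    for _ in order:
        even = not even
    return even


def least_element_in_P(order):
    # Depends on which element happens to be least: not invariant.
    return order[0] in P


universe = ["a", "b", "c"]
print(invariant_on(universe, even_cardinality))    # same answer for all orders
print(invariant_on(universe, least_element_in_P))  # answers differ
```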


We start by defining a general notion of a logic on finite structures by 
imposing two requirements: an effective syntax and an isomorphism-invariant 
semantics. 


Definition 3.5.29. A logic on a domain D of finite structures is a pair
$(L, \models)$, where L is a function that assigns to each vocabulary τ a
decidable set $L(\tau)$ (whose elements are called τ-sentences), and
$\models$ is a binary relation between sentences and finite structures, so
that for each sentence $\psi \in L(\tau)$, the class
$\{\mathfrak A \in D(\tau) : \mathfrak A \models \psi\}$ is closed under
isomorphism.

Recall that, by Definition 3.2.10, a logic captures PTIME on a domain
D if every polynomial-time decidable model class in D is definable in that
logic, and if, for every sentence of the logic, the model-checking problem on D
can be solved in polynomial time. To exclude pathological examples such as the
first one above, we impose in addition the condition that for each sentence, a
polynomial-time model-checking algorithm can be effectively constructed.


Definition 3.5.30. A logic $(L, \models)$ effectively captures PTIME on a
domain D of finite structures if it captures PTIME in the sense of
Definition 3.2.10 and, moreover, there exists a computable function, which
associates with every sentence $\psi \in L(\tau)$ an algorithm M and a
polynomial p, such that M decides
$\{\mathfrak A \in D(\tau) : \mathfrak A \models \psi\}$ in time p(n). We
simply say that $(L, \models)$ effectively captures PTIME if it does so on
the class of all finite structures.


This definition can be modified in the obvious way to other complexity 
classes. All capturing results that we have proved so far are effective in this 
sense. 


Exercise 3.5.31. A complexity class C is recursively indexable on a domain
D if there is a recursive index set I, a computable function f mapping every
$i \in I$ to (the code of) a Turing machine $M_i$, and an appropriate resource
bound (e.g. a polynomial bounding the running time of $M_i$) such that:


(1) The class $K_i$ of all structures from D accepted by $M_i$ is in C, and,
moreover, $M_i$ together with the given resource bound witnesses the
membership of $K_i$ in the complexity class C.

(2) For each model class $K \in C$ on the domain D, there is an $i \in I$
such that $M_i$ decides K.


Prove that there is a logic that effectively captures C on the domain D if, and 
only if, C is recursively indexable on D. 


The above definition of a logic may seem too abstract for practical pur- 
poses. However, it is justified by the equivalence with recursive indexings, as 
described in the exercise above, and by a result of Dawar [32], which shows 
that if there is any logic that effectively captures PTIME, then there also 
exists a natural one. More precisely, Dawar proved that, from any logic effec- 
tively capturing PTIME, one could extract a model class K that is complete for 
PTIME under first-order reductions. As a consequence, PTIME would also be 
effectively captured by the logic $\mathrm{FO}[Q_K]$, which adjoins to FO the
vectorized Lindström quantifiers associated with K (see [32, 38] for more
information).


Exercise 3.5.32. Many finite-model theorists conjecture that there is no logic 
that effectively captures PTIME on finite structures. If you are the first to 
prove this, you may win one million dollars. Why? 


3.6 Algorithmic Model Theory


3.6.1 Beyond Finite Structures 


For a long time, descriptive complexity theory has been concerned almost 
exclusively with finite structures. Although important problems remain open, 
the relationship between definability and complexity on finite structures is now 
fairly well understood, and there are interesting connections to fields such as 
databases, knowledge representation, and computer-aided verification. 

However, for many applications, the strict limitation to finite structures 
is too restrictive. In most of the fields mentioned above, there have been 
considerable efforts to extend the relevant methodology from finite struc- 
tures to suitable domains of infinite ones. In particular, this is the case for 
databases and computer-aided verification where infinite structures (like con- 
straint databases or transition systems with infinite state spaces) are of in- 
creasing importance. 

Finite model theory should therefore be generalized to a more compre- 
hensive algorithmic model theory that extends the research programme, the 
general approach, and the methods of finite model theory to interesting do- 
mains of infinite structures. From a more general theoretical point of view, 
one may ask what domains of infinite structures are suitable for such an ex- 
tension. More specifically, one may ask what conditions must be satisfied by a 
domain D of structures that are not necessarily finite such that the approach 
and methods of finite model theory make sense. There are two obvious and 
fundamental conditions: 


Finite representations. Every structure $\mathfrak A \in D$ should be
representable in a finite way (e.g. by a binary string, an algorithm, a
collection of automata, an axiomatization in some logic, an
interpretation, ...).

Effective semantics. For the relevant logics (e.g. first-order logic), the
model-checking problem on D should be decidable. That is, given a sentence
$\psi \in L$ and a representation of a structure $\mathfrak A \in D$, it
should be decidable whether $\mathfrak A \models \psi$.


These are just minimal requirements, which may need to be refined accord- 
ing to the context and the questions to be considered. We may, for instance, 
also require the following: 


Closure. For every structure $\mathfrak A \in D$ and every formula
$\psi(\bar x)$, the expansion $(\mathfrak A, \psi^{\mathfrak A})$ of
$\mathfrak A$ with the relation defined by ψ should also be contained in D.

Effective query evaluation. Suppose that we have fixed a way of representing
structures. Given a representation of $\mathfrak A \in D$ and a formula
$\psi(\bar x)$, we should be able to compute a representation of
$\psi^{\mathfrak A}$ (or of the expanded structure
$(\mathfrak A, \psi^{\mathfrak A})$).

Note that, contrary to the case of finite structures, query evaluation does
not necessarily reduce to model checking. Further, instead of just
effectiveness of these tasks, it may be required that they can be performed
within some complexity bounds.


3.6.2 Finitely Presentable Structures 


We briefly survey here some domains of infinite but finitely presentable struc- 
tures which may be relevant to algorithmic model theory. We shall then discuss 
in a more detailed way metafinite structures, for which descriptive complexity 
issues have already been studied quite intensively. 

Recursive structures are countable structures whose functions and re- 
lations are computable and therefore finitely presentable. They have been 
studied quite intensively in model theory since the 1960s (see e.g. [6, 42]). Al- 
though recursive model theory is very different from finite model theory, there 
have been some papers studying classical issues of finite model theory on re- 
cursive structures and recursive databases [50, 64, 65, 94]. However, for most 
applications, the domain of recursive structures is far too large. In general, 
only quantifier-free formulae admit effective evaluation algorithms. 

Constraint databases provide a database model that admits infinite 
relations that are finitely presented by quantifier-free formulae (constraints) 
over some fixed background structure. For example, to store geometrical data, 
it is useful not just to have a finite set as the universe of the database, but 
to include all real numbers ‘in the background’. Also, the presence of inter- 
preted functions on the real numbers, such as addition and multiplication, is 
desirable. The constraint database framework introduced by Kanellakis,
Kuper, and Revesz [74] meets both requirements. Formally, a constraint
database consists of a context structure $\mathfrak A$, such as
$(\mathbb R, <, +, \cdot)$, and a set $\{\varphi_1, \dots, \varphi_m\}$ of
quantifier-free formulae defining the database relations. Constraint databases
are treated in detail in [81] and in Chap. 5 of this book. 

Automatic structures are structures whose functions and relations
are represented by finite automata. Informally, a relational structure
$\mathfrak A = (A, R_1, \dots, R_m)$ is automatic if we can find a regular
language $L_\delta \subseteq \Sigma^*$ (which provides names for the elements
of $\mathfrak A$) and a function $\nu : L_\delta \to A$ mapping every word
$w \in L_\delta$ to the element of $\mathfrak A$ that it represents. The
function ν must be surjective (every element of $\mathfrak A$ must be named)
but need not be injective (elements can have more than one name). In addition,
it must be recognizable by finite automata (reading their input words
synchronously) whether two words in $L_\delta$ name the same element, and,
for each relation $R_i$ of $\mathfrak A$, whether a given tuple of words in
$L_\delta$ names a tuple in $R_i$.


Example 3.6.1. (1) All finite structures are automatic.

(2) Some important examples of automatic structures are Presburger
arithmetic $(\mathbb N, +)$, and its expansions
$\mathfrak N_p := (\mathbb N, +, |_p)$ by the relation $x \,|_p\, y$ which
says that x is a power of p dividing y. Using p-ary encodings (starting with
the least significant digit), it is not difficult to construct automata
recognizing equality, addition, and $|_p$.

(3) For $p \in \mathbb N$, let
$\mathrm{Tree}(p) := (\{0, \dots, p-1\}^*, (\sigma_i)_{i<p}, \preceq, \mathrm{el})$,
where $\sigma_i(x) := xi$, $x \preceq y$ means that $xz = y$ for some z, and
$\mathrm{el}(x, y)$ means that x and y have equal length. Obviously, these
structures are automatic as well.
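For instance, addition in binary (p = 2) is recognized by a synchronous
automaton with two states, namely the current carry. The following sketch
simulates such an automaton on the convolution of three encodings, least
significant digit first. It is an illustration only; the function names and
the padding of shorter words with zeros are our own conventions.

```python
def digits_lsb_first(n, length):
    """Binary encoding of n, least significant digit first, padded with 0s."""
    return [(n >> i) & 1 for i in range(length)]


def accepts_addition(x, y, z):
    """Simulate the carry automaton on the convolution of the three words.

    States: carry 0 (initial and accepting) and carry 1.
    """
    length = max(x.bit_length(), y.bit_length(), z.bit_length(), 1)
    word = zip(digits_lsb_first(x, length),
               digits_lsb_first(y, length),
               digits_lsb_first(z, length))
    carry = 0
    for a, b, c in word:
        total = a + b + carry
        if total % 2 != c:
            return False  # no transition for this letter: reject
        carry = total // 2
    return carry == 0
```

The automaton reads one triple of digits per step and never looks back, which
is exactly the synchronous reading required in the definition of automatic
structures.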


Automatic structures provide a vast playground for finite-model theorists, 
with many examples of high relevance to computer science. There are also in- 
teresting connections to computational group theory, where automatic groups 
have already been studied quite intensively [41, 44]. The general notion of 
structures presentable by automata was proposed in [75], and their theory 
has been developed in [16, 18, 19, 92]. 

The notion of an automatic structure can be modified and generalized in 
many directions. By using automata over infinite words, we obtain the notion 
of ω-automatic structures (which, unlike automatic structures, may have
uncountable cardinality). 


Example 3.6.2. (1) All automatic structures are ω-automatic.
(2) The additive group of reals, $(\mathbb R, +)$, and indeed the expanded
structure $\mathfrak R_p := (\mathbb R, +, <, |_p, 1)$ are ω-automatic, where


$$x \,|_p\, y \text{ iff } x = p^n \text{ and } y = kx
  \text{ for some } n, k \in \mathbb Z.$$


(3) The tree structures Tree(p) can be extended in a natural way to the
(uncountable) ω-automatic structures
$\mathrm{Tree}^\omega(p) = (\{0, \dots, p-1\}^{\le\omega}, (\sigma_i)_{i<p}, \preceq, \mathrm{el})$.


Unlike the class of recursive structures, automatic structures and
ω-automatic structures admit effective (in fact, automatic) evaluation of all
first-order queries and possess many other pleasant algorithmic properties.


Theorem 3.6.3. The model-checking problems for first-order logic on the
domains of automatic or ω-automatic structures are decidable.


There are a number of extensions of this result, for instance to the ex- 
tension of first-order logic by the quantifier ‘there exist infinitely many’ 
[19]. There also are model-theoretic characterizations of automatic and
ω-automatic structures, in terms of interpretations into appropriate
expansions of Presburger arithmetic, trees, or the additive group of reals
(see Examples 3.6.1 and 3.6.2). We write
$\mathfrak A \le_{\mathrm{FO}} \mathfrak B$ to denote that there exists a
first-order interpretation of $\mathfrak A$ in $\mathfrak B$. Note that the
domains of automatic and ω-automatic structures are closed under first-order
interpretations.


Theorem 3.6.4 (Blumensath and Grädel). (1) For every structure $\mathfrak A$,
the following are equivalent:


(i) $\mathfrak A$ is automatic.
(ii) $\mathfrak A \le_{\mathrm{FO}} \mathfrak N_p$ for some (and hence all) p ≥ 2.
(iii) $\mathfrak A \le_{\mathrm{FO}} \mathrm{Tree}(p)$ for some (and hence all) p ≥ 2.

(2) For every structure $\mathfrak A$, the following are equivalent:


(i) $\mathfrak A$ is ω-automatic.
(ii) $\mathfrak A \le_{\mathrm{FO}} \mathfrak R_p$ for some (and hence all) p ≥ 2.
(iii) $\mathfrak A \le_{\mathrm{FO}} \mathrm{Tree}^\omega(p)$ for some (and hence all) p ≥ 2.


For a proof, see [19]. There are similar characterizations for tree-automatic
structures [16]. For further results on automatic structures, see [10, 16, 18, 19,
75-78, 92].

The model-theoretic characterizations of automatic and ω-automatic
structures in terms of interpretability suggest a general way to obtain other
domains of infinite structures that may be interesting for algorithmic model
theory: fix a structure $\mathfrak A$ with ‘nice’ (algorithmic and/or
model-theoretic) properties and an appropriate notion of interpretation, and
consider the class of all structures that are interpretable in $\mathfrak A$.
Obviously, each structure in this class is finitely presentable (by an
interpretation). Further, many ‘nice’ properties are preserved by
interpretations, and so every structure in the class inherits them from
$\mathfrak A$. In particular, every class of queries that is effective on
$\mathfrak A$ and closed under first-order operations is effective on the
closure of $\mathfrak A$ under first-order interpretations. This approach is
also relevant to the domain of structures that we discuss next.

Tree-interpretable structures are structures that are interpretable in
the infinite binary tree $T^2 = (\{0,1\}^*, \sigma_0, \sigma_1)$ via a
(one-dimensional) MSO-interpretation. By Rabin’s Theorem, monadic
second-order formulae can be effectively evaluated on $T^2$. Since MSO is
closed under one-dimensional interpretations, the Interpretation Lemma implies
that tree-interpretable structures admit effective evaluation for MSO.
Tree-interpretable structures generalize various notions of infinite graphs
that have been studied in logic, automata theory, and verification. Some
examples are context-free graphs [87, 88], which are the configuration graphs
of pushdown automata, HR-equational and VR-equational graphs [27], which are
defined via certain graph grammars, and prefix-recognizable graphs [25], which
can for instance be defined as graphs of the form $(V, (E_a)_{a \in A})$,
where V is a regular language and each edge relation $E_a$ is a finite union
of sets $X(Y \times Z) = \{(xy, xz) : x \in X, y \in Y, z \in Z\}$, for
regular languages X, Y, Z. In fact, some of these classes coincide with the
class of tree-interpretable graphs (see [17]).


Theorem 3.6.5. For any graph $G = (V, (E_a)_{a \in A})$, the following are
equivalent:


(i) G is tree-interpretable.
(ii) G is VR-equational.
(iii) G is prefix-recognizable.
(iv) G is the restriction to a regular set of the configuration graph of a
pushdown automaton with ε-transitions.


On the other hand, the classes of context-free graphs and of HR-equational 
graphs are strictly contained in the class of tree-interpretable graphs. 


Exercise 3.6.6. Prove that every tree-interpretable structure is automatic. Is
the converse also true?


Tree-Constructible Structures: the Caucal Hierarchy 


The question arises of whether there are even more powerful domains than 
the tree-interpretable structures on which monadic second-order logic is effec- 
tive. An interesting way to obtain such domains is to use tree constructions 
that associate with any structure a kind of tree unravelling. A simple vari- 
ant is the unfolding of a labelled graph G from a given node v to the tree 
T(G, v). Courcelle and Walukiewicz [28, 29] have shown that the MSO-theory 
of T(G,v) can be effectively computed from the MSO-theory of (G,v). A 
more general operation, applicable to relational structures of any kind, has 
been invented by Muchnik. Given a relational structure
$\mathfrak A = (A, R_1, \dots, R_m)$, let its iteration
$\mathfrak A^* = (A^*, R_1^*, \dots, R_m^*, \mathrm{suc}, \mathrm{clone})$ be
the structure with universe $A^*$, relations
$R_i^* = \{(wa_1, \dots, wa_r) : w \in A^*, (a_1, \dots, a_r) \in R_i\}$, the
successor relation $\mathrm{suc} = \{(w, wa) : w \in A^*, a \in A\}$, and the
predicate clone consisting of all elements of the form waa. It is not
difficult to see that unfoldings of graphs are first-order interpretable in
their iterations. Muchnik’s Theorem states that the monadic theory of
$\mathfrak A^*$ is decidable if the monadic theory of $\mathfrak A$ is so (for
proofs, see [11, 101]). We define the domain of tree-constructible structures
to be the closure of the domain of finite structures under (one-dimensional)
MSO-interpretations and iterations. By Muchnik’s Theorem, and since effective
MSO model checking is preserved under interpretations, the tree-constructible
structures are finitely presentable and admit effective evaluation of
MSO-formulae.
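The iteration of a finite structure is infinite, but its finite initial
segments are easy to write down and may help in reading the definition. The
sketch below is an illustration under our own conventions: the function name
`iteration`, the representation of words as tuples, and the cut-off parameter
`depth` are assumptions, and only words of length at most `depth` are
generated.

```python
from itertools import product


def iteration(universe, relations, depth):
    """Finite part of the iteration: all words of length <= depth.

    relations maps each relation name to a set of tuples over universe.
    """
    universe = list(universe)
    words = [w for n in range(depth + 1) for w in product(universe, repeat=n)]
    prefixes = [w for w in words if len(w) < depth]

    # R*: copies of R placed below every word w, i.e. tuples (w a_1, ..., w a_r).
    lifted = {name: {tuple(w + (a,) for a in tup)
                     for w in prefixes for tup in rel}
              for name, rel in relations.items()}
    # suc: w is related to each of its one-letter extensions wa.
    suc = {(w, w + (a,)) for w in prefixes for a in universe}
    # clone: words whose last two letters coincide.
    clone = {w for w in words if len(w) >= 2 and w[-1] == w[-2]}
    return words, lifted, suc, clone
```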

The tree-constructible graphs form the Caucal hierarchy, which was defined
in [26] in a slightly different way. The definition is easily extended to
arbitrary structures: let $C_0$ be the class of finite structures, and let
$C_{n+1}$ be the class of structures that are interpretable in the iteration
$\mathfrak A^*$ of a structure $\mathfrak A \in C_n$. There are a number of
different, but equivalent, ways to define the
levels of the Caucal hierarchy. For instance, one can use the inverse ratio- 
nal mappings given in [25] rather than monadic interpretations, and simple 
unfoldings rather than iterations without changing the hierarchy [24]. Equiv- 
alently, the hierarchy can be defined via higher-order pushdown automata. It 
is known that the Caucal hierarchy is strict, and that it does not exhaust the 
class of all structures with a decidable MSO-theory. We refer to [24, 98] for 
details and further information. 


3.6.3 Metafinite Structures


The class of infinite structures for which descriptive complexity theory has
been studied most intensively is the class of the metafinite structures,
proposed by Grädel and Gurevich [48], and studied also in [30, 49, 53, 84].
These structures are somewhat reminiscent of the two-sorted structures that we
used to define fixed-point logic with counting, (IFP + C). There, the second
sort was a finite linear order $(\{0, \dots, n\}, <)$. Metafinite structures
are similar two-sorted structures, with the essential differences that (1) the
numerical sort need not be finite, (2) the structures may contain functions
from the first to the second sort, and (3) operations more general than
counting are considered.


Definition 3.6.7. A (simple) metafinite structure is a triple
$\mathfrak D = (\mathfrak A, \mathfrak R, W)$ consisting of the following:


(i) A finite structure $\mathfrak A$, called the primary part of
$\mathfrak D$.

(ii) A finite or infinite structure $\mathfrak R$, called the secondary (or
numerical) part of $\mathfrak D$. We always assume that $\mathfrak R$ contains
two distinguished elements 0 and 1 (or true and false).

(iii) A finite set W of functions $w : A^k \to R$.


The vocabulary of $\mathfrak D$ is the triple
$\tau(\mathfrak D) = (\tau_a, \tau_r, \tau_w)$, where each component of
$\tau(\mathfrak D)$ is the set of relation or function symbols in the
corresponding component of $\mathfrak D$. (We always consider constants as
functions of arity 0.) The two distinguished elements 0, 1 of $\mathfrak R$
are named by constants of $\tau_r$.


Example 3.6.8. (ℝ-structures) The descriptive complexity theory over the
real numbers developed by Grädel and Meer [53] (see Sect. 3.6.5) is based on
ℝ-structures, which are simple metafinite structures with a secondary part
$\mathfrak R = (\mathbb R, +, -, \cdot, /, <, (c_r)_{r \in \mathbb R})$. It is
convenient to include subtraction and division as primitive operations and
assume that every element $r \in \mathbb R$ is named by a constant $c_r$, so
that any rational function $g : \mathbb R^k \to \mathbb R$ (i.e. any quotient
of two polynomials) can be written as a term.


There are many variations of metafinite structures. An important one is 
metafinite structures with multiset operations. Any function $f : A \to R$
defines a multiset $\mathrm{mult}(f) = \{\{ f(a) : a \in A \}\}$ over R (where
the notation $\{\{ \dots \}\}$ indicates that we may have multiple occurrences
of the same element). For any set R, let fm(R) denote the class of all finite
multisets over R. In some of the metafinite structures that we consider, the
secondary part $\mathfrak R$ is not just a (first-order) structure in the
usual sense, but instead it comes with a collection of multiset operations
$\Gamma : \mathrm{fm}(R) \to R$, mapping finite multisets
over R to elements of R. Some natural examples on, say, the real numbers 
are addition, multiplication, counting, mean, maximum, and minimum. The 
use of multiset operations will become clearer when we introduce logics for 
metafinite structures. Let us just remark that multiset operations are a natural 
way to make precise the notion of aggregates in database query languages such 
as SQL. 
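The analogy with aggregates can be illustrated by a few lines of Python. The
sketch below is only an illustration: it represents a finite multiset as a
`collections.Counter` (element mapped to multiplicity), and the function
names are ours. Each operation takes a non-empty finite multiset of numbers
and returns a single number, in the same way as SUM, COUNT, MAX, or AVG act on
a column that may contain repeated values.

```python
from collections import Counter
from math import prod


def mult(f, A):
    """The multiset {{ f(a) : a in A }}, with multiplicities."""
    return Counter(f(a) for a in A)


def ms_sum(m):
    return sum(value * n for value, n in m.items())


def ms_product(m):
    return prod(value ** n for value, n in m.items())


def ms_count(m):
    return sum(m.values())


def ms_max(m):
    return max(m)  # iterating a Counter yields its distinct elements


def ms_mean(m):
    return ms_sum(m) / ms_count(m)


# A weight function on a three-element primary part; two elements share a value.
weights = {"ann": 3, "bob": 5, "eve": 3}
m = mult(weights.get, weights)
print(ms_sum(m), ms_product(m), ms_count(m), ms_max(m))
```

Working with the set of values instead of the multiset would lose the
repeated value 3 and change the sum, the product, and the count.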


Example 3.6.9. (Arithmetical structures). Of particular interest to us are
metafinite structures whose secondary part is a structure $\mathfrak N$ over
the natural numbers such that

• $\mathfrak N$ includes at least the constants 0, 1, the functions +, ·, the
  ordering relation <, and the multiset operations max, min, $\sum$ (sum),
  and $\prod$ (product).

• All functions, relations, and multiset operations of $\mathfrak N$ can be
  evaluated in polynomial time.


We call metafinite structures of this kind arithmetical structures. A simple
arithmetical structure is obtained from an arithmetical structure by omitting
the multiset operations.


By itself, the notion of metafinite structures contains nothing revolution- 
ary: they are just a special kind of two-sorted structures. The interesting 
feature of metafinite model theory is not just the structures themselves, but 
the logics, which access the primary and the secondary part in different ways 
and are designed so that the approach and methods of finite model theory 
remain meaningful and applicable. An important feature of these logics is 
that they contain, besides formulae and terms in the usual sense, a calculus 
of weight terms from the primary to the secondary part. 


Definition 3.6.10. Let L be any of the logics for finite structures, such as
FO, LFP, ... as described in the previous sections, and let
$\tau = (\tau_a, \tau_r, \tau_w)$ be a vocabulary for metafinite structures
(where $\tau_r$ may or may not have names for multiset operations). The
appropriate modification of L for reasoning about metafinite structures
$\mathfrak D = (\mathfrak A, \mathfrak R, W)$ of vocabulary τ is defined as
follows. We fix a countable set $V = \{x_0, x_1, \dots\}$ of variables ranging
over elements of the primary part $\mathfrak A$ only. The point terms
(defining functions $f : A^k \to A$), the weight terms (defining functions
$w : A^k \to R$), and the formulae (defining relations $R \subseteq A^k$) of
$L[\tau]$ are defined inductively as follows:


(1) Point terms are defined in the usual way, by closing the set of variables
V under application of function symbols from $\tau_a$.

(2) Weight terms can be built by applying weight function symbols from
$\tau_w$ to point terms, and function symbols from $\tau_r$ to previously
defined weight terms. Note that there are no variables ranging over R.

(3) Atomic formulae are equalities of point terms, equalities of weight terms,
expressions $P t_1 \cdots t_r$ containing relation symbols $P \in \tau_a$ and
point terms $t_1, \dots, t_r$, or expressions $Q f_1 \cdots f_r$ containing
predicates $Q \in \tau_r$ and weight terms $f_1, \dots, f_r$.

(4) All the rules of L for building formulae (via propositional connectives,
quantifiers, and other operators) may be applied, taking into account the
condition that only variables from V may be used.

(5) In addition, we have the characteristic function rule: if
$\varphi(\bar x)$ is a formula, then $\chi[\varphi](\bar x)$ is a weight term.

(6) If $\tau_r$ contains multiset operations, these provide additional means
for building new weight terms. Let $F(\bar x, \bar y)$ be a weight term,
$\varphi(\bar x, \bar y)$ a formula (both with free variables among
$\bar x, \bar y$), and Γ a multiset operation. The expression

$$\Gamma_{\bar x} (F(\bar x, \bar y) : \varphi)$$

is then a weight term with free variables $\bar y$. (If φ = true, we simplify
this notation to $\Gamma_{\bar x} F(\bar x, \bar y)$.)


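The semantics for (1) to (4) is the obvious one. A term
$\chi[\varphi](\bar x)$ evaluates to 1 if $\varphi(\bar x)$ is true, and to 0
otherwise. Finally, let $G(\bar y)$ be a weight term
$\Gamma_{\bar x}(F(\bar x, \bar y) : \varphi)$ formed by application of a
multiset operation. The weight term $F(\bar x, \bar y)$ defines, on a
metafinite structure $\mathfrak D = (\mathfrak A, \mathfrak R, W)$, a function
$F^{\mathfrak D} : A^{k+m} \to R$. For any fixed tuple $\bar b$, the
collection of values $F^{\mathfrak D}(\bar a, \bar b)$, as $\bar a$ ranges
over those tuples such that $\varphi(\bar a, \bar b)$ is true, forms a finite
multiset


$$(F : \varphi)^{\mathfrak D}(\bar b) := \{\{ F^{\mathfrak D}(\bar a, \bar b) :
  \bar a \in A^k \text{ such that } \mathfrak D \models
  \varphi(\bar a, \bar b) \}\}.$$


The interpretation of $G(\bar b)$ on $\mathfrak D$ is obtained by applying Γ
to this multiset, i.e.

$$G^{\mathfrak D}(\bar b) := \Gamma((F : \varphi)^{\mathfrak D}(\bar b)).$$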


Example 3.6.11. (Binary representations.) Consider arithmetic structures
with a primary part of the form $\mathfrak{A} = (\{0, \ldots, n-1\}, <, P)$ where
$P$ is a unary relation. $P$ is interpreted as a bit sequence $u_0 \cdots u_{n-1}$
representing the natural number $\sum_{i=0}^{n-1} u_i 2^i$ (where $u_i = 1$ iff
$\mathfrak{A} \models P(i)$). The number represented by $P$ is definable by the
term

\[ \sum_{x} \Bigl( \chi[Px] \cdot \prod_{y} (2 : y < x) \Bigr). \]
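
As a quick check of Example 3.6.11, the following sketch evaluates the defining term directly, with the sum and the product computed exactly as the multiset operations prescribe. The function name `number_represented_by` and the encoding of $P$ as a Python set are assumptions made for illustration.

```python
from math import prod


def number_represented_by(n, P):
    """Evaluate sum_x ( chi[P x] * prod_y (2 : y < x) ) over A = {0, ..., n-1}."""
    total = 0
    for x in range(n):
        chi = 1 if x in P else 0                      # characteristic function of P
        power = prod(2 for y in range(n) if y < x)    # empty product is 1, so this is 2^x
        total += chi * power
    return total


value = number_represented_by(4, {0, 2})   # bits u_0 = u_2 = 1, i.e. 1 + 4
```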


Example 3.6.12. (Counting elements.) On arithmetic structures, first-order
logic can count. For any formula $\varphi(\bar x)$, there is a weight term
$\#_{\bar x}[\varphi(\bar x)]$ counting the number of tuples $\bar a$ such that
$\varphi(\bar a)$ is true, namely

\[ \#_{\bar x}[\varphi(\bar x)] := \sum_{\bar x} \chi[\varphi]. \]
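
The counting term can be illustrated in the same style. This is a minimal sketch, assuming the formula is given as a Python predicate on tuples; `count_tuples` is a hypothetical helper name.

```python
from itertools import product


def count_tuples(universe, k, phi):
    """Evaluate #_x[phi(x)] as the sum over all k-tuples of chi[phi]."""
    return sum(1 if phi(a) else 0 for a in product(universe, repeat=k))


# Number of pairs (x, y) with x < y over the universe {0, 1, 2}.
ordered_pairs = count_tuples(range(3), 2, lambda a: a[0] < a[1])
```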


3.6.4 Metafinite Spectra 


Does descriptive complexity theory generalize in a meaningful way from finite
to metafinite structures? To give some evidence that such generalizations are
indeed possible and fruitful, we focus here on generalizations of Fagin's
Theorem to (1) arithmetical structures, and (2) $\mathbb{R}$-structures (see the
examples given above).

Recall that Fagin’s Theorem says that generalized spectra (or, equivalently, 
the properties of finite structures that are definable in existential second-order 
logic) coincide with the complexity class NP. To discuss possible translations 
to metafinite structures, we need to make precise two notions: 


• The notion of a metafinite spectrum, i.e. a generalized spectrum of
metafinite structures.

• The notion of complexity (in particular, deterministic and
non-deterministic polynomial time) in the context of metafinite structures.

For a fixed structure $\mathfrak{R}$, let $M_\tau[\mathfrak{R}]$ denote the class
of metafinite structures with secondary part $\mathfrak{R}$ and vocabulary
$\tau = (\tau_a, \tau_r, \tau_w)$ (where, of course, $\tau_r$ is the vocabulary of
$\mathfrak{R}$). We start with two notions of metafinite spectra.


Definition 3.6.13. A class $\mathcal{K} \subseteq M_\tau[\mathfrak{R}]$ is a
metafinite spectrum if there exists a first-order sentence $\psi$ of a
vocabulary $\tau' \supseteq \tau$ such that $\mathfrak{D} \in \mathcal{K}$ if and
only if there exists an expansion $\mathfrak{D}' \in M_{\tau'}[\mathfrak{R}]$ of
$\mathfrak{D}$ with $\mathfrak{D}' \models \psi$. (Note that the secondary part is
not expanded.) A primary metafinite spectrum is defined in a similar way,
except that only the primary part of the structures is expanded, and not the
set of weight functions. This means that the expanded structures
$\mathfrak{D}'$ have the same set of weight functions as $\mathfrak{D}$.


These two notions of metafinite spectra correspond to two variants of
existential second-order logic. The more restrictive variant allows
second-order quantification over primary relations only, whereas the general
one allows quantification over weight functions as well. Thus, a primary
metafinite spectrum is the class of structures
$\mathfrak{D} \in M_\tau[\mathfrak{R}]$ which are models of an existential
second-order sentence of the form $\exists R_1 \cdots \exists R_m \psi$, where
$R_1, \ldots, R_m$ are relation variables over the primary part, and $\psi$ is
first-order. Since relations over the primary part can be replaced by their
characteristic functions, a metafinite spectrum in the more general sense is
the class of models of a sentence $\exists F_1 \cdots \exists F_m \psi$, where the
$F_i$ are function symbols ranging over weight functions. We shall see that
both notions of metafinite spectra capture (suitable variants of)
non-deterministic polynomial time in certain contexts, but fail to do so in
others.

In general, the notion of complexity for problems on metafinite structures
depends on the computation model used and on the cost (or size) associated
with the elements of the secondary part. For instance, if the secondary part
consists of natural numbers or binary strings, then a natural notion of cost
is given by the number of bits. On the other hand, below we shall study
complexity over real numbers with respect to the Blum-Shub-Smale model,
and there every element of $\mathbb{R}$ will be treated as a basic entity of
cost one.

Let $\|r\|$ denote the cost of $r$. For a metafinite structure
$\mathfrak{D} = (\mathfrak{A}, \mathfrak{R}, W) \in M_\tau[\mathfrak{R}]$, let
$|\mathfrak{D}| := |\mathfrak{A}|$ and let
$\max \mathfrak{D} := \max_{w \in W} \max_{\bar a} \|w(\bar a)\|$, the cost of
the maximal weight. Assuming $\mathfrak{R}$ and $\tau$ to be fixed,
$\|\mathfrak{D}\|$, the cost of representing $\mathfrak{D}$, is polynomially
bounded in $|\mathfrak{D}|$ and $\max \mathfrak{D}$ (via a polynomial that depends
only on the vocabulary of $\mathfrak{D}$). Since most of the popular complexity
classes are invariant under polynomial increase of the relevant input
parameters, it therefore makes sense to measure the complexity in terms of
$|\mathfrak{D}|$ and $\max \mathfrak{D}$. For instance, an algorithm on a class of
metafinite structures runs in polynomial time or in logarithmic space if, for
every input $\mathfrak{D}$, the computation terminates in at most
$q(|\mathfrak{D}|, \max \mathfrak{D})$ steps, for some polynomial $q$, or uses at
most $O(\log |\mathfrak{D}| + \log \max \mathfrak{D})$ of work space, respectively.

We first discuss arithmetical structures, as described in Example 3.6.9,
assuming that the cost of natural numbers is given by the length of their binary
representations. So the question is whether, or under what circumstances, NP
is captured by the class of metafinite spectra or primary metafinite spectra.
The original proof of Fagin's Theorem generalizes to the case of arithmetical
structures with weights that are not too large.


Definition 3.6.14. A class $\mathcal{K}$ of metafinite structures has small
weights if there exists a $k \in \mathbb{N}$ such that
$\max \mathfrak{D} \le |\mathfrak{D}|^k$ for all $\mathfrak{D} \in \mathcal{K}$.
As $\max \mathfrak{D}$ stands for the cost of the largest weight, this means that
the values of the weights are bounded by a function $2^{p(|\mathfrak{D}|)}$ for
some polynomial $p$.
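
To make the two size parameters concrete, here is a small sketch for arithmetical structures, assuming that weight functions are given as dictionaries from argument tuples to natural numbers and that the cost of 0 is taken to be one bit. The helper names are hypothetical, and the check is for a single structure and a given exponent, whereas the definition requires one exponent for the whole class.

```python
def bit_cost(value):
    """Cost ||r|| of a natural number: the length of its binary representation."""
    return max(1, value.bit_length())   # assumption: the number 0 costs one bit


def size_and_max_cost(universe, weight_functions):
    """Return (|D|, max D) for a structure with the given weight functions."""
    costs = [bit_cost(v) for w in weight_functions for v in w.values()]
    return len(universe), max(costs, default=0)


def respects_bound(universe, weight_functions, k):
    """Check max D <= |D|^k for this one structure and this exponent k."""
    size, max_cost = size_and_max_cost(universe, weight_functions)
    return max_cost <= size ** k


universe = range(3)
w = {(0,): 5, (1,): 0, (2,): 1024}
parameters = size_and_max_cost(universe, [w])
```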


We obtain the following first generalization of Fagin’s result. 


Theorem 3.6.15 (Grädel and Gurevich). Let
$\mathcal{K} \subseteq M_\tau[\mathfrak{N}]$ be a class of arithmetical structures
with small weights which is closed under isomorphisms. The following are
equivalent:


(i) $\mathcal{K}$ is in NP.
(ii) $\mathcal{K}$ is a primary generalized spectrum.


Proof. It is obvious that (ii) implies (i). The converse can be reduced to
Fagin's Theorem as follows. We assume that for every structure
$\mathfrak{D} = (\mathfrak{A}, \mathfrak{N}, W)$ in $\mathcal{K}$, we have that
$\max \mathfrak{D} \le n^k$, where $n = |\mathfrak{D}| = |\mathfrak{A}|$; further,
we suppose, without loss of generality, that an ordering $<$ on $A$ is available
(otherwise we expand the vocabulary with a binary relation $<$ and add a
conjunct $\beta(<)$ asserting that $<$ is a linear order). We can then identify
$A^k$ with the initial subset $\{0, \ldots, n^k - 1\}$ of $\mathbb{N}$, viewed as
bit positions of the binary representations of the weights of $\mathfrak{D}$.
With every $\mathfrak{D} \in \mathcal{K}$ we associate a finite structure
$\mathfrak{D}_f$ by expanding the primary part $\mathfrak{A}$ as follows: for
every weight function $w \in W$ of arity $j$, we add a new relation $P_w$ of
arity $j + k$, where

\[ P_w := \{ (\bar a, \bar t) : \text{the } \bar t\text{th bit of } w(\bar a) \text{ is } 1 \}. \]

Then $\mathcal{K}$ is in NP if and only if
$\mathcal{K}_f = \{\mathfrak{D}_f : \mathfrak{D} \in \mathcal{K}\}$ is an NP-set of
finite structures, and, in fact, we can choose the encodings in such a way that
$\mathfrak{D}$ and $\mathfrak{D}_f$ are represented by the same binary string.
Thus, if $\mathcal{K}$ is in NP, then, by Fagin's Theorem, $\mathcal{K}_f$ is a
generalized spectrum, defined by a first-order sentence $\psi$.

As in Example 3.6.11, one can construct a first-order sentence $\alpha$ (whose
vocabulary consists of the weight functions $w \in \tau_w$ and the corresponding
primary relations $P_w$) which expresses the assertion that the $P_w$ encode the
weight functions $w$ in the sense defined above. Then $\psi \wedge \alpha$ is a
first-order sentence witnessing that $\mathcal{K}$ is a primary metafinite
spectrum.
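
The passage from weight functions to the relations $P_w$ can be sketched as follows. For simplicity this illustration uses plain integers as bit positions rather than $k$-tuples over $A$, and the function names are hypothetical.

```python
def weight_to_relation(w, num_bits):
    """Return P_w = {(a, t) : the t-th bit of w(a) is 1} for positions t < num_bits."""
    return {(a, t) for a, value in w.items()
            for t in range(num_bits) if (value >> t) & 1}


def relation_to_weight(P_w, arguments):
    """Recover the weight function from its bit relation."""
    w = {a: 0 for a in arguments}
    for a, t in P_w:
        w[a] += 1 << t       # the t-th bit contributes 2^t
    return w


w = {0: 5, 1: 0, 2: 3}
P_w = weight_to_relation(w, num_bits=3)
recovered = relation_to_weight(P_w, w.keys())
```

The round trip succeeds only if `num_bits` is at least the cost of the largest weight, which is the role played by the small-weights assumption in the proof.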


The above result also holds for arithmetical structures without multiset
operations. However, without the restriction that the weights are small, it is
no longer true that every NP-set is a primary metafinite spectrum. If we have
inputs with huge weights compared with the primary part, then relations over
the primary part cannot code enough information to describe computations
that are bounded by a polynomial in the length of the weights.

It is tempting to use unrestricted metafinite spectra instead. However,
metafinite spectra in the general sense capture a much larger class than NP.


Theorem 3.6.16 (Grädel and Gurevich). On arithmetical structures,
metafinite spectra capture the recursively enumerable sets.


We sketch the proof here. It is not difficult to show that every metafinite
spectrum of arithmetical structures is recursively enumerable. For the
converse, we first note that any tuple $\bar a \in \mathbb{N}^k$ can be viewed as
an arithmetical structure with an empty primary vocabulary and $k$ nullary
weight functions $a_1, \ldots, a_k$. Thus an arithmetical relation
$S \subseteq \mathbb{N}^k$ can be viewed as a special class of arithmetical
structures. We show first that every recursively enumerable set
$S \subseteq \mathbb{N}^k$ is a metafinite spectrum. In particular, there exist
undecidable metafinite spectra.

By Matijasevich's Theorem (see [83]), every recursively enumerable set
$S \subseteq \mathbb{N}^k$ is Diophantine, i.e. can be represented as

\[ S = \{ \bar a \in \mathbb{N}^k : \text{there exist } b_1, \ldots, b_m \in \mathbb{N} \text{ such that } Q(\bar a, \bar b) = 0 \} \]

for some polynomial $Q \in \mathbb{Z}[x_1, \ldots, x_k, y_1, \ldots, y_m]$. Let
$P, P' \in \mathbb{N}[\bar x, \bar y]$ be such that
$Q(\bar x, \bar y) = P(\bar x, \bar y) - P'(\bar x, \bar y)$. Thus $S$ is a
metafinite spectrum; the desired first-order sentence uses additional weight
functions $b_1, \ldots, b_m$ and asserts that
$P(\bar a, \bar b) = P'(\bar a, \bar b)$.
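
A small worked instance may help. The example below is not taken from the text; it uses the set of composite numbers, which is decidable and therefore only illustrates the shape of the construction, not its full strength.

```latex
% Illustrative example (an assumption, not from the source): composite numbers.
\[
  S = \{ a \in \mathbb{N} : \text{there exist } b_1, b_2 \in \mathbb{N}
        \text{ such that } (b_1 + 2)(b_2 + 2) - a = 0 \}.
\]
% Here Q(x, y_1, y_2) = (y_1 + 2)(y_2 + 2) - x has integer coefficients.
% Splitting Q into its positive and negative parts gives polynomials over N:
\[
  P(x, y_1, y_2) = y_1 y_2 + 2 y_1 + 2 y_2 + 4, \qquad P'(x, y_1, y_2) = x,
  \qquad Q = P - P'.
\]
% The first-order sentence uses two additional nullary weight functions
% b_1, b_2 and asserts the equality of two weight terms:
\[
  b_1 \cdot b_2 + 2 \cdot b_1 + 2 \cdot b_2 + 4 = a .
\]
```

For $a = 15$ the expansion $b_1 = 1$, $b_2 = 3$ satisfies the sentence, whereas for $a = 7$ no expansion does.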

This can be extended to any recursively enumerable class of arithmetical
structures, with an arbitrary vocabulary. To see this, we encode structures
$\mathfrak{D} \in M_\tau[\mathfrak{N}]$ by tuples $c(\mathfrak{D}) \in \mathbb{N}^k$,
where $k$ depends only on $\tau$. (In fact, it is no problem to reduce $k$ to 1.)
Similarly to the case of finite structures, an encoding involves the selection
of a linear order on the primary part. In fact, it is often more convenient to
have a ranking of the primary part rather than just a linear ordering.


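Definition 3.6.17. Suppose that $\mathfrak{R}$ contains a copy of
$(\mathbb{N}, <)$. A ranking of a metafinite structure
$\mathfrak{D} = (\mathfrak{A}, \mathfrak{R}, W)$ is a bijection
$r : A \to \{0, \ldots, n-1\} \subseteq R$. A class
$\mathcal{K} \subseteq M_\tau[\mathfrak{R}]$ is ranked if $\tau$ contains a weight
function $r$ whose interpretation on every $\mathfrak{D} \in \mathcal{K}$ is a
ranking.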


The Coding Lemma for arithmetical structures [48] says that for every
vocabulary $\tau$ there exists an encoding function that associates with every
ranked arithmetical $\tau$-structure $\mathfrak{D}$ a tuple
$\mathrm{code}(\mathfrak{D}) \in \mathbb{N}^k$ with the following properties:


(1) code is definable by first-order terms.

(2) The primary part and the weight functions of $\mathfrak{D}$ can be
reconstructed from $\mathrm{code}(\mathfrak{D})$ in polynomial time.

(3) There exists a polynomial $p(n, m)$ such that
$\mathrm{code}(\mathfrak{D})_i < 2^{p(|\mathfrak{D}|, \max \mathfrak{D})}$ for
every $i < k$.

Now let $\mathcal{K} \subseteq M_\tau[\mathfrak{N}]$ be recursively enumerable.
The set

\[ \mathrm{code}(\mathcal{K}) := \{ \mathrm{code}(\mathfrak{D}, r) : \mathfrak{D} \in \mathcal{K},\ r \text{ is a ranking of } \mathfrak{D} \} \subseteq \mathbb{N}^k \]

is then also recursively enumerable and therefore Diophantine. The desired
first-order sentence $\psi$ uses, besides the symbols of $\tau$, a unary weight
function $r$ and nullary weight functions $b_1, \ldots, b_m$ and expresses the
assertions (i) that $r$ is a ranking and (ii) that
$Q(\mathrm{code}(\mathfrak{D}, r), \bar b) = 0$ for a suitable polynomial
$Q \in \mathbb{Z}[x_1, \ldots, x_k, y_1, \ldots, y_m]$ defining
$\mathrm{code}(\mathcal{K})$.


3.6.5 Descriptive Complexity over the Real Numbers 


There are other contexts in which metafinite spectra do indeed capture (a
suitable notion of) non-deterministic polynomial time. An important example
is computation over the real numbers based on the model of Blum, Shub,
and Smale.


Computation over R 


In 1989 Blum, Shub, and Smale [15] introduced a model for computations 
over the real numbers (and other rings as well), which is now usually called 
the BSS machine. The important difference from, say, the Turing model is 
that real numbers are treated as basic entities and that arithmetic operations 
on the reals are performed in a single step, independently of the magnitude or 
complexity of the numbers involved. In particular, the model abstracts from 
the problems that in actual computers real numbers have to be approximated 
by bit sequences, that the complexity of arithmetic operations depends on the 
length of these approximate representations, that rounding errors occur, and 
that exact testing for 0 is impossible in practice. Similar notions of computa- 
tions over arbitrary fields or rings had been investigated earlier in algebraic 
complexity theory (see [22] for a comprehensive treatment). A novelty of the 
approach of Blum, Shub, and Smale is that their model is uniform (for all 
input lengths) whereas the ideas explored in algebraic complexity (such as 
straight-line programs, arithmetic circuits, and decision trees) are typically 
non-uniform. One of the main purposes of the BSS approach was to create a 
uniform complexity theory dealing with problems that have an analytical and 
topological background, and to show that certain problems remain hard even 
if arbitrary reals are treated as basic entities. 

Many basic concepts and fundamental results of classical computability
and complexity theory reappear in the BSS model: the existence of universal
machines, the classes $\mathrm{P}_{\mathbb{R}}$ and $\mathrm{NP}_{\mathbb{R}}$ (real
analogues of P and NP), and the existence of
$\mathrm{NP}_{\mathbb{R}}$-complete problems. Of course, these ideas appear in a
different form, with a strong analytical flavour: typical examples of
undecidable, recursively enumerable sets are complements of certain Julia sets,
and the first problem that was shown to be $\mathrm{NP}_{\mathbb{R}}$-complete is
the question of whether a given multivariate polynomial of degree four has a
real root [15]. As in the classical setting, all problems in the class
$\mathrm{NP}_{\mathbb{R}}$ are decidable within exponential time (but this is not
as trivial as in the classical case), and the $\mathrm{P}_{\mathbb{R}}$ versus
$\mathrm{NP}_{\mathbb{R}}$ question is one of the major open problems.

However, there are also many differences between classical and real
complexity theory. Just to mention a few, we note that the meaning of space
resources seems to be very different, that certain separation results between
complexity classes can be established (such as
$\mathrm{NC}_{\mathbb{R}} \subsetneq \mathrm{P}_{\mathbb{R}}$ and
$\mathrm{NP}_{\mathbb{R}} \subsetneq \mathrm{EXP}_{\mathbb{R}}$) whose analogues in
the classical theory are open, and that some discrete problems seem to change
their complexity behaviour when considered in the BSS model. For a detailed
treatment we refer the interested reader to the book [14].


The BSS Model 


Let $\mathbb{R}^* := \bigcup_{k \in \mathbb{N}} \mathbb{R}^k$, or (almost)
equivalently, the set of functions $X : \mathbb{N} \to \mathbb{R}$ with
$X(n) = 0$ for all but finitely many $n$. For any $X \in \mathbb{R}^*$, we call
$|X| := \max\{n : X(n) \neq 0\}$ the length of $X$. Note that
$\mathbb{R}^* \times \mathbb{R}^*$ can be identified with $\mathbb{R}^*$ in a
natural way by concatenation. A Blum-Shub-Smale machine (in what follows called
a BSS machine) is essentially a Random Access Machine over $\mathbb{R}$ which can
evaluate rational functions at unit cost and whose registers can store
arbitrary real numbers.


Definition 3.6.18. A BSS machine $M$ over $\mathbb{R}$ is given by a finite set
$I$ of instructions labelled by $0, \ldots, N$. The input and output spaces are
subsets of $\mathbb{R}^*$. A configuration is a quadruple
$(k, r, w, x) \in I \times \mathbb{N} \times \mathbb{N} \times \mathbb{R}^*$, where
$k$ is the instruction currently being executed, $r$ and $w$ are the numbers of
the so-called 'copy registers' (see below) and $x$ describes the content of the
registers of the machine. Given an input $x \in \mathbb{R}^*$, the computation is
started with a configuration $(0, 0, 0, x)$. If a configuration $(k, r, w, x)$
with $k = N$ is reached, the computation stops; in that case the value of $x$ is
the output computed by the machine. The instructions of $M$ are of the
following types:


• Computation. An instruction $k$ of this type performs an update
$x_0 \leftarrow g_k(x)$ of the first register, where $g_k$ is a rational function
on $\mathbb{R}^m$ (for some $m$). Simultaneously, the copy registers may be
updated by rules $r \leftarrow r + 1$ or $r \leftarrow 0$, and similarly for $w$.
The other registers remain unchanged. The next instruction will be $k + 1$.

• Branch. $k$: if $x_0 > 0$ goto $\ell$ else goto $k + 1$. The contents of the
registers remain unchanged.

• Copy. $k$: $x_w \leftarrow x_r$, i.e. the content of the 'read register' is
copied into the 'write register'. The next instruction is $k + 1$; all other
registers remain unchanged.
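
The three instruction types can be made concrete with a toy interpreter. This is only a sketch under explicit assumptions: exact rationals (`fractions.Fraction`) stand in for real numbers, the halting label $N$ is taken to be the length of the program list, and the tuple encoding of instructions, the names `run_bss` and `DECIDE`, and the step bound are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from fractions import Fraction


def update(index, rule):
    """Apply a copy-register rule: 'inc' (index + 1), 'reset' (0) or None (unchanged)."""
    if rule == "inc":
        return index + 1
    if rule == "reset":
        return 0
    return index


def run_bss(program, x, max_steps=10_000):
    """Run `program` (labels 0..N-1; reaching label N = len(program) halts) on input x."""
    regs = defaultdict(Fraction)          # registers not yet written read as 0
    for j, value in enumerate(x):
        regs[j] = Fraction(value)
    k = r = w = 0                         # start configuration (0, 0, 0, x)
    for _ in range(max_steps):
        if k == len(program):
            return regs
        instr = program[k]
        if instr[0] == "compute":         # x_0 <- g_k(x), then update r and w
            _, g, r_rule, w_rule = instr
            regs[0] = g(regs)
            r, w = update(r, r_rule), update(w, w_rule)
            k += 1
        elif instr[0] == "branch":        # if x_0 > 0 goto target else goto k + 1
            k = instr[1] if regs[0] > 0 else k + 1
        else:                             # copy: x_w <- x_r
            regs[w] = regs[r]
            k += 1
    raise RuntimeError("step bound exceeded")


# Hypothetical program: leave 1 in register 0 if x_0^2 - x_1 > 0, and 0 otherwise.
DECIDE = [
    ("compute", lambda x: x[0] * x[0] - x[1], None, None),  # 0: evaluate the polynomial
    ("branch", 4),                                          # 1: positive case jumps to 4
    ("compute", lambda x: Fraction(1), None, None),         # 2: make x_0 positive ...
    ("branch", 6),                                          # 3: ... to jump unconditionally
    ("compute", lambda x: Fraction(1), None, None),         # 4: answer 1
    ("branch", 7),                                          # 5: jump to the halting label
    ("compute", lambda x: Fraction(0), None, None),         # 6: answer 0, then halt
]

accepted = run_bss(DECIDE, [3, 5])[0]     # 3^2 - 5 > 0
rejected = run_bss(DECIDE, [2, 5])[0]     # 2^2 - 5 <= 0
```

Since the only branch instruction tests $x_0 > 0$, an unconditional jump has to be simulated by first writing a positive value into register 0, as in instructions 2 and 3.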


A set $L \subseteq \mathbb{R}^*$ is in $\mathrm{P}_{\mathbb{R}}$ if there exists a
BSS machine whose running time on every $X \in \mathbb{R}^*$ is bounded by a
polynomial in $|X|$, and which accepts $X$ if and only if $X \in L$. The analogue
of NP is the class $\mathrm{NP}_{\mathbb{R}}$. A set $L \subseteq \mathbb{R}^*$ is
in $\mathrm{NP}_{\mathbb{R}}$ if there exists a set
$L' \in \mathrm{P}_{\mathbb{R}}$ and a constant $k$ such that
$L = \{X \in \mathbb{R}^* : (\exists Y \in \mathbb{R}^*)(|Y| \le |X|^k \wedge (X, Y) \in L')\}$.
Equivalently, $\mathrm{NP}_{\mathbb{R}}$ can be defined as the class of problems
over $\mathbb{R}^*$ that are decidable in polynomial time by a non-deterministic
BSS machine, i.e. a BSS machine that can non-deterministically guess real
numbers $Y \in \mathbb{R}$ at unit cost.




Encodings. Recall that $\mathbb{R}$-structures are metafinite structures
$\mathfrak{D} = (\mathfrak{A}, \mathfrak{R}, W)$ with a second sort
$\mathfrak{R} = (\mathbb{R}, +, -, \cdot, /, <, (c_r)_{r \in \mathbb{R}})$. We want
to relate decision problems for $\mathbb{R}$-structures (described by logical
formulae) to decision problems on $\mathbb{R}^*$ (decided by BSS machines). We
first consider an example.


Example 3.6.19. (4-Feasibility.) The first problem that was shown to be
$\mathrm{NP}_{\mathbb{R}}$-complete was the problem of whether a real polynomial
of degree at most four in $n$ unknowns (where $n$ varies with the input) has a
real zero. This problem can be considered as a decision problem on
$\mathbb{R}$-structures as follows. Let $A = \{0, \ldots, n\}$. The coefficients
of a homogeneous polynomial $g \in \mathbb{R}[X_0, \ldots, X_n]$ can be coded via
a function $C : A^4 \to \mathbb{R}$, such that

\[ g = \sum_{0 \le i, j, k, \ell \le n} C(i, j, k, \ell)\, X_i X_j X_k X_\ell. \]

We obtain an arbitrary (not necessarily homogeneous) polynomial
$f \in \mathbb{R}[X_1, \ldots, X_n]$ of degree four by setting $X_0 = 1$ in $g$.
Thus, every multivariate polynomial $f$ of degree at most four is represented by
the $\mathbb{R}$-structure $(\mathfrak{A}, \mathfrak{R}, \{C\})$, where
$\mathfrak{A} = (\{0, \ldots, n\}, <, 0, n)$ and $C$ is a function from $A^4$ into
$\mathbb{R}$.


Observe that $\mathbb{R}^*$ can be viewed as the class of all
$\mathbb{R}$-structures where the primary part is a finite linear order
$(\{0, \ldots, n-1\}, <)$, and $W$ consists of a single unary function
$X : \{0, \ldots, n-1\} \to \mathbb{R}$. Hence decision problems on
$\mathbb{R}^*$ can be regarded as a special case of decision problems on
$\mathbb{R}$-structures (in the same way as words can be considered as special
cases of finite structures). Conversely, $\mathbb{R}$-structures
$\mathfrak{D} = (\mathfrak{A}, \mathfrak{R}, W)$ can be encoded in $\mathbb{R}^*$.
We choose a ranking on $A$ and replace all functions and relations in the
primary part by the appropriate characteristic functions
$\chi : A^k \to \{0, 1\} \subseteq \mathbb{R}$. This gives a structure whose
primary part is a plain set $A$, with functions $X_1, \ldots, X_t$ of the form
$X_i : A^k \to \mathbb{R}$ and with the ranking $r : A \to \mathbb{R}$. Each of the
functions $X_i$ can be represented by a tuple
$x_0, \ldots, x_{m-1} \in \mathbb{R}^m$, where $m = |A|^k$ and
$x_j = X_i(\bar a(j))$, and where $\bar a(j)$ is the $j$th tuple in $A^k$ with
respect to the lexicographic order induced by $r$. The concatenation of these
tuples gives an encoding $\mathrm{code}(\mathfrak{D}, r) \in \mathbb{R}^*$ (which
depends on the ranking $r$ that was chosen).
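
The encoding just described can be sketched as follows, assuming that the ranking is given as a Python function and that every function of the structure (including characteristic functions of primary relations) is passed together with its arity. The name `encode` and the toy structure are hypothetical.

```python
from itertools import product


def encode(universe, rank, functions):
    """Concatenate the value tuples of all functions, in the order induced by rank.

    functions: list of (arity, callable) pairs.
    """
    ordered = sorted(universe, key=rank)
    code = []
    for arity, X in functions:
        # product over a sorted list enumerates tuples lexicographically
        for args in product(ordered, repeat=arity):
            code.append(X(*args))
    return code


# Toy structure: A = {"a", "b"}, one weight function X and one edge relation E.
universe = ["a", "b"]
rank = {"a": 1, "b": 0}.get
weights = {"a": 2.5, "b": -1.0}
edges = {("a", "b")}
X = lambda u: weights[u]
chi_E = lambda u, v: 1 if (u, v) in edges else 0   # characteristic function of E

code = encode(universe, rank, [(1, X), (2, chi_E)])
```

A different ranking permutes the entries of the result, which is why the encoding depends on $r$.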

Obviously, for structures $\mathfrak{D}$ of a fixed finite signature, the length
of $\mathrm{code}(\mathfrak{D}, r)$ is bounded by some polynomial $n^\ell$, where
$n = |\mathfrak{D}|$ and $\ell$ depends only on the signature. Thus we can also
view $\mathrm{code}(\mathfrak{D}, r) = (x_0, \ldots, x_{n^\ell - 1})$ as a single
function $X_{\mathfrak{D}} : A^\ell \to \mathbb{R}$, where
$X_{\mathfrak{D}}(\bar a(i)) = x_i$ for all $i < n^\ell$. Thus, encoding an
$\mathbb{R}$-structure in $\mathbb{R}^*$ basically means representing the whole
structure by a single function (of appropriate arity) from
$\{0, \ldots, n-1\}$ into $\mathbb{R}$.

Furthermore, this encoding is first-order definable in the following sense.


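Lemma 3.6.20. For every signature $\tau$, there is a first-order formula
$\beta(X, r)$ of signature $\tau \cup \{X, r\}$ such that, for all
$\mathbb{R}$-structures $\mathfrak{D}$ of signature $\tau$, for all rankings $r$,
and for all functions $X$,

\[ (\mathfrak{D}, X, r) \models \beta(X, r) \quad \text{iff} \quad X = \mathrm{code}(\mathfrak{D}, r). \]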


As in the case of finite structures, we say that a class $\mathcal{K}$ of
$\mathbb{R}$-structures is in the complexity class $\mathrm{P}_{\mathbb{R}}$ or
$\mathrm{NP}_{\mathbb{R}}$ if the set of its encodings is. Recall that a
metafinite spectrum of $\mathbb{R}$-structures is a set $\mathcal{K}$ of
$\mathbb{R}$-structures that is definable by an existential second-order
sentence $\exists Y_1 \cdots \exists Y_r \psi$, where $\psi$ is first-order and the
variables $Y_i$ range over weight functions $Y_i : A^k \to \mathbb{R}$. Fagin's
Theorem now has the following analogue in the real setting.


Theorem 3.6.21 (Grädel and Meer). Let $\mathcal{K}$ be a class of
$\mathbb{R}$-structures. Then $\mathcal{K} \in \mathrm{NP}_{\mathbb{R}}$ if and
only if $\mathcal{K}$ is a metafinite spectrum.


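Proof. It is easy to see that metafinite spectra are in
$\mathrm{NP}_{\mathbb{R}}$. Suppose that
$\psi = \exists Y_1 \cdots \exists Y_r \varphi$. Given an input structure
$\mathfrak{D}$, we guess assignments for all functions $Y_i$ and evaluate
$\varphi$ on $(\mathfrak{D}, Y_1, \ldots, Y_r)$ in polynomial time.

For the converse, let $\mathcal{K} \in \mathrm{NP}_{\mathbb{R}}$ and let
$\mathcal{K}'$ be the corresponding problem in $\mathrm{P}_{\mathbb{R}}$, with
$\mathcal{K} = \{\mathfrak{D} : \exists Y((\mathfrak{D}, Y) \in \mathcal{K}')\}$.
Let $M$ be a polynomial-time BSS machine deciding $\mathcal{K}'$, and let $m$ be
a natural number such that $M$ stops on encodings of $(\mathfrak{D}, Y)$ after
less than $n^m$ steps and uses at most $n^m - 3$ registers, where
$n = |\mathfrak{D}|$.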

We first suppose that we have a ranking $r : A \to \mathbb{R}$ available. From
$r$, the induced (lexicographic) ranking $r_m : A^m \to \mathbb{R}$ is first-order
definable: we can identify the element in $A$ of maximal rank and thus have the
number term $n$ available; we can then use $r_m(\bar t)$ as an abbreviation for

\[ r(t_1) n^{m-1} + \cdots + r(t_{m-1}) n + r(t_m). \]

We can then identify $A^m$ with the initial subset $\{0, \ldots, n^m - 1\}$ of
$\mathbb{N}$. Thus, in the formulae to be constructed below, $m$-tuples
$\bar t = t_1, \ldots, t_m$ of variables are considered to range over natural
numbers $t < n^m$. Conditions such as $\bar t = 0$ or $\bar t = \bar s + 1$ can
then be expressed by first-order formulae of vocabulary $\{r\}$.
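
The induced ranking is simply the base-$n$ value of the tuple of ranks. A minimal sketch, with the hypothetical helper `induced_rank` taking the ranks $r(t_1), \ldots, r(t_m)$ directly:

```python
from itertools import product


def induced_rank(ranks, n):
    """Compute r(t_1) n^(m-1) + ... + r(t_(m-1)) n + r(t_m) by Horner's rule."""
    value = 0
    for r in ranks:
        value = value * n + r
    return value


example = induced_rank((1, 0, 2), 3)    # 1*9 + 0*3 + 2

# The induced ranking is a bijection from {0, ..., n-1}^m onto {0, ..., n^m - 1}.
all_ranks = sorted(induced_rank(t, 3) for t in product(range(3), repeat=2))
```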

The computation of $M$ for a given input $\mathrm{code}(\mathfrak{D}, Y)$ can be
represented by a function $Z : A^{2m} \to \mathbb{R}$ as follows:


• $Z(0, \bar t)$ is the instruction executed by $M$ at time $t$.
• $Z(1, \bar t)$ and $Z(2, \bar t)$ are the indices of the read and write
registers of $M$ at time $t$.
• $Z(j + 3, \bar t)$ is the content of register $j$ at time $t$.

We construct a first-order formula $\psi$ with the property that, for all ranked
structures $(\mathfrak{D}, Y)$ and all $Z$, we have that
$(\mathfrak{D}, Y, Z) \models \psi$ iff $Z$ represents an accepting computation of
$M$ for $\mathrm{code}(\mathfrak{D}, Y)$.

We first have to express the assertion that at time $t = 0$, the function $Z$
encodes the input configuration of $M$ on $(\mathfrak{D}, Y)$. Thus we need a
subformula stating that $Z(i, 0) = 0$ for $i = 0, 1, 2$ and that the values
$Z(j + 3, 0)$ encode the input $(\mathfrak{D}, Y)$. By Lemma 3.6.20 this can be
expressed in first-order logic.

Second, we have to ensure that for every $t < n^m - 1$, if the sequence
$(Z(j, \bar t) : j = 0, \ldots, n^m - 1)$ represents a configuration of $M$, then
the sequence of values $Z(j, \bar t + 1)$ represents the successor configuration.
The formula asserting this has the form

\[ \forall \bar t \bigwedge_{k=0}^{N} \bigl( Z(0, \bar t) = k \rightarrow \varphi_k \bigr) \]

where $\varphi_k$ describes transitions performed by the instruction $k$.

Consider for example a computation instruction
$k : x_0 \leftarrow g(x_0, \ldots, x_\ell)$, and assume in addition that it
increases the index of the read register by 1 and sets the index of the write
register back to 0. The formula $\varphi_k$ then has to express the following:


• $Z(0, \bar t + 1) = k + 1$ (the next instruction is $k + 1$);

• $Z(1, \bar t + 1) = Z(1, \bar t) + 1$ (the read register index is increased
by 1);

• $Z(2, \bar t + 1) = 0$ (the write register index is set back to 0);

• $Z(3, \bar t + 1) = g(Z(3, \bar t), Z(4, \bar t), \ldots, Z(\ell + 3, \bar t))$
(into register 0, $M$ writes the result of applying the rational function $g$ to
the register contents at time $t$);

• $Z(j, \bar t + 1) = Z(j, \bar t)$ for all $j > 3$ (the other registers remain
unchanged).


Clearly, these conditions are first-order expressible. It should be noted that
whenever $f_0, \ldots, f_\ell$ are number terms and
$g : \mathbb{R}^{\ell+1} \to \mathbb{R}$ is a rational function, then
$g(f_0, \ldots, f_\ell)$ is also a number term.

For another example illustrating the explicit use of the embedding function,
consider a copy instruction $k : x_w \leftarrow x_r$. Here the formula has to
express (besides the updating of the instruction number, etc., which is done as
above) the assertion that the content of the register $Z(2, \bar t)$ at time
$t + 1$ is the same as the content of the register $Z(1, \bar t)$ at time $t$.
This is expressed by the formula

\[ \forall \bar u\, \forall \bar v \bigl( [Z(1, \bar t) = r_m(\bar u) \wedge Z(2, \bar t) = r_m(\bar v)] \rightarrow Z(\bar v + 3, \bar t + 1) = Z(\bar u + 3, \bar t) \bigr). \]

To express the assertion that $M$ accepts its input, we just have to say that
$Z(3, n^m - 1) = 1$ (by convention, the result of the computation, if it is a
single number, is stored in register 0).

Combining all these subformulae in the appropriate way, we obtain the
desired formula $\psi$. It then follows that for all structures $\mathfrak{D}$,

\[ \mathfrak{D} \in \mathcal{K} \quad \text{iff} \quad \mathfrak{D} \models (\exists Y)(\exists Z)\psi, \]

which proves the theorem for the case of ranked structures.

Finally, we do away with the assumption that the input structures are
ranked. If no ranking is given on the input structures $\mathfrak{D}$, we can
introduce one by existentially quantifying over the function $r$ and adding a
conjunct $\alpha(r)$ which asserts that $r$ is one-one and that, for all $t$ with
$r(t) \neq 0$, there exists an element $s$ such that $r(s) + 1 = r(t)$. It follows
that

\[ \mathcal{K} = \{ \mathfrak{D} : \mathfrak{D} \models (\exists r)(\exists Y)(\exists Z)(\alpha \wedge \psi) \}. \]


Example 3.6.22. (Logical description of 4-Feasibility.) An existential
second-order sentence for the 4-feasibility problem quantifies two functions
$X : A \to \mathbb{R}$ and $Y : A^4 \to \mathbb{R}$ where $X(1), \ldots, X(n)$
describes the zero and $Y(\bar u)$ is the partial sum of all monomials up to
$\bar u \in A^4$ in $f(X_1, \ldots, X_n)$ (according to the lexicographical order
on $A^4$). Thus the 4-feasibility problem is described by the sentence

\[ \psi := (\exists X)(\exists Y)\Bigl( Y(\bar 0) = C(\bar 0) \wedge Y(\bar n) = 0 \wedge \forall \bar u \bigl( \bar u \neq \bar 0 \rightarrow Y(\bar u) = Y(\bar u - 1) + C(\bar u) \prod_{i=1}^{4} X(u_i) \bigr) \Bigr). \]

Indeed, $\mathfrak{D} \models \psi$ if and only if the polynomial $f$ of degree
four defined by $\mathfrak{D}$ has a real zero.
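
The role of the function $Y$ can be illustrated by checking a candidate zero. The sketch below follows the partial-sum recursion over $A^4$ in lexicographic order, under the assumptions that coefficients and the candidate point are exact rationals, that coefficients are stored in a dictionary (missing entries counting as 0), and that $X(0) = 1$ as in Example 3.6.19. The name `final_partial_sum` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def final_partial_sum(n, C, X):
    """Return Y(n, n, n, n), where Y(u) = Y(u - 1) + C(u) * prod_i X(u_i)."""
    Y = Fraction(0)
    for u in product(range(n + 1), repeat=4):   # lexicographic order on A^4
        term = Fraction(C.get(u, 0))
        for i in u:
            term *= X[i]
        Y += term                               # Y(u) from Y(u - 1)
    return Y


# f(X_1) = X_1^2 - 4, coded homogeneously as X_1 X_1 X_0 X_0 - 4 X_0^4.
C = {(1, 1, 0, 0): 1, (0, 0, 0, 0): -4}
is_zero = final_partial_sum(1, C, [1, 2]) == 0      # X_0 = 1, X_1 = 2
not_zero = final_partial_sum(1, C, [1, 3]) == 0     # X_0 = 1, X_1 = 3
```

A guessed pair $(X, Y)$ satisfies the sentence exactly when the last partial sum vanishes, which is what the final comparison tests.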


Capturing Results for Other Complexity Classes 


By combining the general ideas of descriptive complexity theory on finite
structures with the approach described here, one can find logical
characterizations for many other complexity levels, notably for polynomial
time, provided that the given $\mathbb{R}$-structures are ranked (i.e. an
ordering on the finite part is available). This is carried out in some detail
in [30, 53].


3.7 Appendix: Alternating Complexity Classes


Alternating algorithms are a generalization of non-deterministic algorithms,
based on two-player games. Indeed, one can view non-deterministic algorithms
as the restriction of alternating algorithms to solitaire (i.e. one-player) games.

Since complexity classes are mostly defined in terms of Turing machines, we
focus on the model of alternating Turing machines. But note that alternating
algorithms can also be defined in terms of other computational models.


Definition 3.7.1. An alternating Turing machine is a non-deterministic
Turing machine whose state set $Q$ is divided into four classes $Q_\exists$,
$Q_\forall$, $Q_{\mathrm{acc}}$, and $Q_{\mathrm{rej}}$. This means that there are
existential, universal, accepting and rejecting states. States in
$Q_{\mathrm{acc}} \cup Q_{\mathrm{rej}}$ are final states. A configuration of $M$
is called existential, universal, accepting, or rejecting according to its state.


The computation graph $G_{M,x}$ of an alternating Turing machine $M$ for an
input $x$ is defined in the same way as for a non-deterministic Turing machine.
Nodes are configurations (instantaneous descriptions) of $M$, there is a
distinguished starting node $C_0(x)$ which is the input configuration of $M$ for
input $x$, and there is an edge from configuration $C$ to configuration $C'$ if,
and only if, $C'$ is a successor configuration of $C$. Recall that for
non-deterministic Turing machines, the acceptance condition is given by the
REACHABILITY problem: $M$ accepts $x$ if, and only if, in the graph $G_{M,x}$ some
accepting configuration $C_a$ is reachable from $C_0(x)$. For alternating Turing
machines, acceptance is defined by the GAME problem (see Sect. 3.1.3): the
players here are called $\exists$ and $\forall$, where $\exists$ moves from
existential configurations and $\forall$ from universal ones. Further, $\exists$
wins at accepting configurations and loses at rejecting ones. By definition, $M$
accepts $x$ if, and only if, Player $\exists$ has a winning strategy from $C_0(x)$
for the game on $G_{M,x}$.
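
For a finite, acyclic computation graph (as arises from a time-bounded machine), the acceptance condition can be evaluated by backward induction. The following is a minimal sketch under that acyclicity assumption; the graph representation by dictionaries and the name `exists_wins` are hypothetical, and the procedure is not meant to reproduce the linear-time GAME algorithm of Sect. 3.1.3.

```python
def exists_wins(start, successors, kind):
    """Decide whether Player 'exists' wins from `start` on an acyclic game graph.

    kind[c] is one of 'exists', 'forall', 'accept', 'reject'.
    """
    memo = {}

    def win(c):
        if c not in memo:
            if kind[c] == "accept":
                memo[c] = True
            elif kind[c] == "reject":
                memo[c] = False
            elif kind[c] == "exists":     # some move must lead to a winning position
                memo[c] = any(win(d) for d in successors[c])
            else:                         # 'forall': every move must do so
                memo[c] = all(win(d) for d in successors[c])
        return memo[c]

    return win(start)


# Toy computation graph with one existential and two universal configurations.
kind = {"c0": "exists", "c1": "forall", "c2": "forall",
        "acc": "accept", "rej": "reject"}
successors = {"c0": ["c1", "c2"], "c1": ["acc", "rej"], "c2": ["acc"]}

accepts = exists_wins("c0", successors, kind)
```

If every configuration is existential, the same procedure degenerates to a reachability test, in line with the remark that non-deterministic acceptance is the one-player case.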


Complexity Classes 


Time and space complexity are defined as for non-deterministic Turing
machines. For a function $F : \mathbb{N} \to \mathbb{R}$, we say that an
alternating Turing machine $M$ is $F$-time-bounded if for all inputs $x$, all
computation paths from $C_0(x)$ terminate after at most $F(|x|)$ steps.
Similarly, $M$ is $F$-space-bounded if no configuration of $M$ that is reachable
from $C_0(x)$ uses more than $F(|x|)$ cells of work space. The complexity classes
ATIME($F$) and ASPACE($F$) contain all problems that are decidable by,
respectively, $F$-time-bounded and $F$-space-bounded alternating Turing machines.
The following classes are of particular interest:


• ALOGSPACE = ASPACE($O(\log n)$),
• APTIME = $\bigcup_{d \in \mathbb{N}}$ ATIME($n^d$),
• APSPACE = $\bigcup_{d \in \mathbb{N}}$ ASPACE($n^d$).


Alternating Versus Deterministic Complexity 


There is a general slogan that parallel time complexity coincides with
sequential space complexity. Indeed, by standard techniques of complexity
theory, one can easily show that, for well-behaved (i.e. space-constructible)
functions $F$, ATIME($F$) $\subseteq$ DSPACE($F^2$) and DSPACE($F$) $\subseteq$
NSPACE($F$) $\subseteq$ ATIME($F^2$) (see [9] for details). In particular,


• APTIME = PSPACE;
• AEXPTIME = EXPSPACE.


On the other hand, alternating space complexity corresponds to exponen- 
tial deterministic time complexity. 


Theorem 3.7.2. For any space-constructible function $F(n) \ge \log n$, we have
that ASPACE($F$) = DTIME($2^{O(F)}$).


Proof. The proof is closely associated with the GAME problem. For any
$F$-space-bounded alternating Turing machine $M$, one can, given an input $x$,
construct the computation graph $G_{M,x}$ in time $2^{O(F(|x|))}$ and then solve
the GAME problem in order to decide the acceptance of $x$ by $M$.

For the converse, we shall show that for any $G(n) \ge n$ and any constant
$c$, DTIME($G$) $\subseteq$ ASPACE($c \cdot \log G$).

Let $L \in$ DTIME($G$). There is then a deterministic one-tape Turing machine
$M$ that decides $L$ in time $G^2$. Let
$\Gamma = \Sigma \cup (Q \times \Sigma) \cup \{*\}$ and $t = G^2(n)$. Every
configuration $C = (q, i, w)$ (in a computation on some input of length $n$) can
be described by a word

\[ c = {*}\, w_0 \cdots w_{i-1} (q w_i) w_{i+1} \cdots w_t\, {*} \in \Gamma^{*}. \]

The $i$th symbol of the successor configuration depends only on the symbols
at positions $i - 1$, $i$, and $i + 1$. Hence, there is a function
$f_M : \Gamma^3 \to \Gamma$ such that, whenever symbols $a_{-1}$, $a_0$, and $a_1$
are at positions $i - 1$, $i$ and $i + 1$ of some configuration $c$, the symbol
$f_M(a_{-1}, a_0, a_1)$ will be at position $i$ of the successor configuration
$c'$.

The following alternating algorithm $A$ decides $L$:


Input: x
Existential step: guess s ≤ t,
    guess (q⁺a) ∈ Q_acc × Σ, i ∈ {0, ..., s}
b := (q⁺a)
for j = 1 ... s do
    Existential step: guess a_{-1}, a_0, a_1 ∈ Γ
    verify that f_M(a_{-1}, a_0, a_1) = b. If not, reject.
    Universal step: choose k ∈ {-1, 0, 1}
    b := a_k
    i := i + k
od
if ith symbol of input configuration of M on x equals b then accept
else reject.

The algorithm $A$ needs space $O(\log G(n))$. If $M$ accepts the input $x$, then
Player $\exists$ has the following winning strategy for the game on $G_{A,x}$: the
value chosen for $s$ is the time at which $M$ accepts $x$, and $(q^+ a)$, $i$ are
chosen so that the configuration of $M$ at time $s$ is of the form
${*}\, w_0 \cdots w_{i-1} (q^+ a) w_{i+1} \cdots w_t\, {*}$. At the $j$th iteration of
the loop (that is, at configuration $s - j$), the symbols at positions
$i - 1, i, i + 1$ of the configuration of $M$ at time $s - j$ are chosen for
$a_{-1}, a_0, a_1$.

Conversely, if $M$ does not accept the input $x$, the $i$th symbol of the
configuration at time $s$ is not $(q^+ a)$. The following holds for all $j$: if,
in the $j$th iteration of the loop, Player $\exists$ chooses $a_{-1}, a_0, a_1$,
then either $f_M(a_{-1}, a_0, a_1) \neq b$, in which case Player $\exists$ loses
immediately, or there is at least one $k \in \{-1, 0, 1\}$ such that the
$(i + k)$th symbol of the configuration at time $s - j$ differs from $a_k$.
Player $\forall$ then chooses exactly this $k$. At the end, $a_k$ will then be
different from the $i$th symbol of the input configuration, so Player $\forall$
wins.

Hence $A$ accepts $x$ if, and only if, $M$ does so.


In particular, it follows that 


• ALOGSPACE = PTIME;
• APSPACE = EXPTIME.


The relationship between the major deterministic and alternating complexity
classes is summarized by the following diagram:

LOGSPACE ⊆ PTIME     ⊆ PSPACE ⊆ EXPTIME ⊆ EXPSPACE ⊆ ...
             ||          ||        ||         ||
           ALOGSPACE ⊆ APTIME ⊆ APSPACE ⊆ AEXPTIME ⊆ ...


Alternating Logarithmic Time 


For time bounds $F(n) < n$, the standard model of alternating Turing machines
needs to be modified a little by an indirect access mechanism. The machine
writes down, in binary, an address $i$ on a separate index tape to access the
$i$th symbol of the input. Using this model, it makes sense to define, for
instance, the complexity class ALOGTIME = ATIME($O(\log n)$).


Example 3.7.8. Construct an ALOGTIME algorithm for the set of palindromes
(i.e., words that are the same when read from right to left and from
left to right).


Important examples of problems in ALOGTIME are 


e the model-checking problem for propositional logic; 
e the data complexity of first-order logic. 


The results mentioned above relating alternating time and sequential space
hold also for logarithmic time and space bounds. Note, however, that these do
not imply that ALOGTIME = LOGSPACE, owing to the quadratic overheads.
It is known that ALOGTIME ⊆ LOGSPACE, but the converse inclusion is
an open problem.